Oh good, I had this penciled in to post but now I don't have to.
Why is nobody* asserting outright fraud here? R&R spent a long time refusing to release their data, then when nobody could replicate their results, they released data that gave the appearance of being cooked.
To put it another way: How would their behavior have been different if they'd been deliberately making stuff up? I think it might be wise to look at their other work.
*nobody that I've heard of.
I've seen accusations of fraud. If they did commit fraud, though, why did they ever release their data?
Never attribute to fraud what can be explained by simple incompetence.
2: because I think
I hadn't quite grasped the details of the story
covers most people and nearly every journalist. It's a somewhat complicated story about a very complicated issue that most people believe they can't, even if they try very hard, really begin to grasp. That said, it's a completely astonishing revelation and should be a very important news item. Instead, I suspect it will be grist for my facebook feed and a few blogs that I no longer read.
4: right. Cock-up is always more likely than conspiracy. Still, this one smells pretty bad.
4, 6: Well, not 'never'. When someone makes a mistake that's both impressively incompetent and really convenient for them, fraud remains a viable possibility.
I recently had an enormously difficult time convincing two physicists that the reason they got a shocking and exciting result that no one else had noticed in a field undergoing intense scrutiny was because of a software bug. Their replies to me went through several stages, including "yes, we see that that bug exists but it did not actually affect our results" and "no, really, in fact we have a deeper understanding of this result based on a purely pen-and-paper calculation" followed by "oh, yes, we agree with you that the outcome of our pen-and-paper calculation is in conflict with basic principles of physics" and finally "huh, all along we were tripped up by a simple software bug". (Some of the intermediate stages evinced a fair amount of anger at me on their part, too.)
And yes, at some intermediate step I thought they must be simply lying about what they had done and that I might have to write an uncomfortable public note exposing the problem, before they finally acknowledged they had just made a stupid mistake and retracted their claim.
Oh man. My immediate reaction to this story was a feeling of terror in the pit of my stomach. I'm sure they're prevaricating asswipes but the idea of accidentally doing something like this in my research is totally haunting.
||
Is there someplace central with all of the current deets re unfoggedecathlon? The missus has consented and I've got the frequent flyer miles.
The missus has consented and I've got the frequent flyer miles.
New mouseover text?
If they did commit fraud, though, why did they ever release their data?
Because they were on the verge of being found out. Other folks had been trying and failing to replicate their data. If they release the data, they can hope that the Excel error goes unnoticed and that the other fudges are attributed to judgment differences - or, in the less-good-case scenario, that folks like Krugman are willing to give them the benefit of the doubt.
Seconding 10 and tidying my code (because that's less stressful than actually writing. Oops).
Or, like, BPhd's mouseover text seven years ago.
16 to 13! stupid refresh. stupid k-sky. stupid, stupid k-sky.
14: I don't think that can be it. The result was primarily driven by their judgements, rather than the Excel error. Their choices mean their result was entirely driven by New Zealand's GDP growth rate in 1951, which was highly negative. They deliberately left out New Zealand from 1946-1950, years in which growth was positive. If they include 1950, or throw out 1951, the result goes away. They also weighted things in a funny way, so that this single observation counts as much as all of the last 20 years of Japanese growth rates. In some ways, they're lucky that the Excel error got uncovered, since the other two choices are much easier to spin as deliberate fraud.
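To make the weighting point concrete, here's a minimal sketch in Python with invented numbers (not the actual R&R series), showing how "one average per country, then average the countries" lets a single bad New Zealand year count as much as twenty ordinary Japanese years:

# Invented growth rates (percent) for country-years in the debt/GDP > 90% bucket.
high_debt_years = {
    "New Zealand": [-7.6],                      # one very bad year
    "Japan": [1.0, 1.2, 0.8, 1.1, 0.9] * 4,     # twenty ordinary years
}

# Equal weight per country: NZ's single year carries as much weight as all of Japan's.
country_means = [sum(v) / len(v) for v in high_debt_years.values()]
by_country = sum(country_means) / len(country_means)

# Equal weight per country-year: NZ's year is one observation out of twenty-one.
all_years = [g for v in high_debt_years.values() for g in v]
by_year = sum(all_years) / len(all_years)

print(f"weighted by country: {by_country:.2f}%")   # about -3.3
print(f"weighted by year:    {by_year:.2f}%")      # about +0.6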
Where is the quality control?! (otherwise known as peer review...).
It's hard to believe that such omissions could be the result of honest error.
19: It was never peer reviewed. It was published in the AER Papers & Proceedings issue, which is just short un-peer-reviewed summaries of research presented at the AEA conference every January.
Honestly, the paper would never have survived peer review in its current form. It's just some summary statistics and graphs that any undergraduate could do (unlike R&R, apparently).
18: I'm not seeing where we have any substantive disagreement on the nature of their behavior - if the biggest problem wasn't in Excel (and that seems correct) that doesn't really change anything.
I have come to the conclusion that if rightwingers didn't have cherry-picked data, they wouldn't have any data at all.
I'm being a little more devil's advocaty than I mean to be. I think people are fully capable of bullshitting themselves into delusion, and would guess that this is what happened here. But still, self-deception on this level seems functionally indistinguishable from fraud.
I've seen the Reinhart-Rogoff error pointed out in many places over the last couple of days, but it hadn't been clear to me to what extent such heavyweights as, e.g., the IMF and the European Union had been relying on their paper in implementing or strongarming austerity measures. I see from the rortybomb piece that Paul Ryan and the Washington Post cited it as authoritative.
It was Dean Baker at the CEPR I originally saw linked to about this: he doesn't say much about who exactly has cited Reinhart & Rogoff to back up their policy prescriptions.
I find it pretty damn unsettling that major power-brokers might act on the basis of a single paper (or, I guess they wrote a book based on the paper as well), without vetting it carefully.
Definitely pit-of-stomach for me. I have a huge unwieldy Excel spreadsheet I've developed whose results' consequences will be rather momentous whatever they are.
Cripe, minivet. Um, set up cross-check columns and rows and whatnot?
I haven't worked with Excel in a serious way for a while, but is there a way to set up, I don't know, some calculation/formula that tells you "Yo! This calculation isn't doing what it's supposed to do!" I can do that to some extent with a database.
Maybe I'm spoiled, but I would never do large-scale manipulations in an Excel sheet, I would only process them programmatically (knime or pipeline pilot since I'm not really a programmer.) Excel is for things like budget models, not data analysis.
Yeah, I have a whole tab for gut-checks of various kinds. But errors in Excel tend to be elusive.
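For whatever it's worth, the non-Excel version of a gut-check tab is usually just a handful of assertions over the raw table. A minimal sketch in Python; the file and column names here are made up:

import csv

# Hypothetical file with one row per country-year: country, year, growth.
with open("growth_by_country_year.csv", newline="") as f:
    rows = list(csv.DictReader(f))

for country in {r["country"] for r in rows}:
    years = sorted(int(r["year"]) for r in rows if r["country"] == country)
    # Yell if years are silently missing from a country's series (the R&R-style gap).
    gaps = [y for y in range(years[0], years[-1] + 1) if y not in set(years)]
    assert not gaps, f"{country} is missing years: {gaps}"

for r in rows:
    growth = float(r["growth"])
    # Yell if a growth rate is wildly implausible (likely a units or entry error).
    assert -50 < growth < 50, f"suspicious growth {growth} for {r['country']} {r['year']}"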
26: Well, it's not exactly data analysis - it's more of a simulation. Even more infeasible, unfortunately, to do it as code. (Excel has ecosystem advantages.)
Excel has ecosystem advantages.
tears bitter tears
.
.
.
For my purposes, Excel is for data entry, for which it's still better than anything but custom-rolling a form for each of my observation types. Although, given what a little bitch xL is about saving as CSV, the margin of better there is vanishing.
Well, ideally you would have someone able to put some time into cross-checking the work before you release it.
There are plenty of checks built in to most systems. But in a lot of cases, model complexity means that you can't have somebody easily check every piece of what you've done, so a lot of the checks come down to "does this seem like it makes sense?" In this case, I assume everybody involved thought it made perfect sense.
I happily don't really have to use Excel. I still do occasionally; it's pretty fast for some tasks.
Also, I worked at a place in a field where they only use Excel, and people had some pretty goddamned impressive deal-structuring worksheets.
For my purposes, Excel is for data entry
Why don't you use a database for that? I don't know what "custom-rolling a form" means.
I'm torn between the desire to point and laugh at R&R and the desire to kick them in the balls. I suppose the two are not really mutually exclusive.
On the OP, it's really amazing the number of awards Reinhart and Rogoff garnered for their book.
34:
Telling wealthy and powerful people what they want to hear seems like a profitable activity. Too bad about the being a terrible person bit.
You can take the profits and pay people to tell you you're a good person.
Wasn't there supposed to be a spreadsheet in development that showed which commenters had schtupped each other? I expect to receive a copy on a commemorative custom USB stick at the Unfoggedycon.
R&R's reply to the debunking is here.
I zoned out at the first paragraph (because I'm tired): people more qualified should assess their response.
They should be raked over the coals if warranted.
37- Yeah, but coding errors resulted in a report that says nosflow has slept with everyone, thus failing the "does this seem like it makes sense?" test.
2
Why is nobody* asserting outright fraud here? ...
Because this sort of thing is quite normal (see 8, 10, 15). You stop looking for mistakes when you get the results you expect. Feynman discusses this in Cargo Cult Science, although the last sentence below (in my opinion at least) is far too optimistic.
We have learned a lot from experience about how to handle some of the ways we fool ourselves. One example: Millikan measured the charge on an electron by an experiment with falling oil drops, and got an answer which we now know not to be quite right. It's a little bit off, because he had the incorrect value for the viscosity of air. It's interesting to look at the history of measurements of the charge of the electron, after Millikan. If you plot them as a function of time, you find that one is a little bigger than Millikan's, and the next one's a little bit bigger than that, and the next one's a little bit bigger than that, until finally they settle down to a number which is higher.
Why didn't they discover that the new number was higher right away? It's a thing that scientists are ashamed of--this history--because it's apparent that people did things like this: When they got a number that was too high above Millikan's, they thought something must be wrong--and they would look for and find a reason why something might be wrong. When they got a number closer to Millikan's value they didn't look so hard. And so they eliminated the numbers that were too far off, and did other things like that. We've learned those tricks nowadays, and now we don't have that kind of a disease.
This is utter moronity if truly a mistake. This column had about 25 values in it, not 1500. And they skipped 5 of them. You can SEE with your EYES the whole data set on one screen in front of you.
We work with small enough data sets that we can double-check to make sure we didn't miss any data points. (although so do Harvard economists which apparently doesn't help) The thing I worry about is some busybody saying that my sample size wasn't big enough to have "power" or something. "Power"? Look at the p-value! It's a p-value, isn't it? Get the hell outta here with your knob.
I prefer to work with data sets that have only two points, because my regressions are always a perfect fit.
19
Where is the quality control?! (otherwise known as peer review...).
Peer review is not intended to and normally will not catch this sort of error. Reviewers are not responsible for correctness. As a reviewer you might, for instance, check that they used the appropriate statistical methods, but not that they performed them correctly.
It's hard to believe that such omissions could be the result of honest error.
Not if you have some experience with how the sausage is actually made.
Peer review might not catch the Excel error if they didn't submit the raw spreadsheet, but would absolutely have called out assumptions 1 and 2.
Further to 38: It seems like the WSJ response from R & R mostly refers to other papers.
By the way, I don't doubt that the fact that R & R are Harvard professors -- this seems noted in every single article -- has something to do with their presumed soundness. And, for a bit of cross-thread frisson, I suppose if we call them "Dr. Reinhart" and "Dr. Rogoff" their status will be clear to all.
Atypically, I'm not worried about data analysis now. Instead, I'm alone in a hotel room and lightly drunk. I need essear to tell me the next step.
Ned, can you look at the R&R response linked in 38 and say whether it makes any sense, on a first pass? It's fairly short.
But mostly 10 is how I feel. I once wasted thousands because I forgot to type "by id;" in several hundred lines of code.
30 Well, ideally you would have someone able to put some time into cross-checking the work before you release it.
Yes! parsimon gets it right. On most of the papers I've written, every calculation has been done independently by at least two collaborators. Some of the ones that required more coding infrastructure weren't fully independent, but there was still quite a bit of "let's-sanity-check-these-results" going on. The thing about using the wrong rows in the Excel formula or whatever would have been easily spotted if the two authors had kept independent notebooks doing the same calculation.
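As a tiny sketch of what that cross-checking amounts to in practice: two collaborators code the same statistic independently, and the check is simply that the numbers agree. The data and the 90% cutoff here are invented for illustration:

import statistics

# Invented country-year data: (growth %, debt/GDP %).
data = [(2.1, 95), (-0.5, 120), (3.0, 40), (1.2, 60), (0.8, 105), (-7.6, 97), (1.9, 30)]

# Collaborator A: explicit loop over the high-debt observations.
total, n = 0.0, 0
for growth, debt in data:
    if debt > 90:
        total += growth
        n += 1
mean_a = total / n

# Collaborator B: independent one-liner computing the same thing.
mean_b = statistics.mean(g for g, d in data if d > 90)

# The whole point: the two independent calculations must agree.
assert abs(mean_a - mean_b) < 1e-9, (mean_a, mean_b)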
45 the fact that R & R are Harvard professors -- this seems noted in every single article -- has something to do with their presumed soundness
Which is really a stupid thing to presume.
The soundness of Harvard professors, that is.
46: See, you've put it back-to-front: you're supposed to go to the hotel room and THEN write the talk (in the essear method). Now you'll just have to invent an entirely novel statistical act.
soundness as professors
I choose to interpret this in the equine sense.
49: And you shuffle data, and multiply all of one input range by 2 (or -1), and invent data from Opposite World. "Do most of your work on your second best days. Test it on your best days."
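A sketch of that kind of perturbation test, assuming a hypothetical analyze() function standing in for whatever the spreadsheet or model computes:

import random

def check_perturbations(analyze, data, tol=1e-9):
    # Crude perturbation checks; they assume the baseline result isn't exactly zero.
    baseline = analyze(data)

    # Doubling every input should move the result.
    assert abs(analyze([x * 2 for x in data]) - baseline) > tol, "result ignores input scale"

    # Flipping the sign of every input should move the result.
    assert abs(analyze([-x for x in data]) - baseline) > tol, "result ignores input sign"

    # Shuffling the rows of an order-independent statistic should change nothing (much).
    shuffled = random.sample(data, len(data))
    assert abs(analyze(shuffled) - baseline) < tol, "result depends on row order"

# Example: the 'analysis' here is just a mean, which passes all three checks.
check_perturbations(lambda xs: sum(xs) / len(xs), [2.0, -1.0, 3.5, 0.5])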
In their WSJ response to the criticism, the authors seem to be saying that the results are basically the same regardless.
It is hard to see how one can interpret these tables and individual country results as showing that public debt overhang over 90% is clearly benign.
"This Time Is Different" was a pretty good book. Don't think it included this 'finding'.
It's incredibly common to find coding errors and the like when people try to replicate economics findings. I can think of many examples just from my own experience; it seems like when a complex empirical paper gets subjected to real scrutiny, it's more likely than not that errors of some kind turn up. I think there's a big selection bias in 'noteworthy' papers: observational data on big economic issues is basically soup, and if you find something that looks clear, it's more likely than not the result of picking out a finding that had some error or questionable modeling decision behind it. Note that can happen without a deliberate intention to commit fraud -- it happens naturally when you tinker with a million different specifications.
Trying to find some kind of inflection point in economic growth based on a specific 'threshold' debt to GDP ratio is incredibly difficult when you think about it. There is very limited macro data, it's difficult just to specify a reasonable model, issues of serial correlation in time series data come up (we discussed those in another thread), and there's no particular reason to think that the causal influence of an incredibly aggregated variable like debt to GDP would be similar across countries (do you really think investors look at a 90 percent debt to GDP ratio similarly when the debt is held by Japan vs. when it's held by Cameroon? Controls for country only take you so far there). And that's just scratching the surface. This was never a believable finding; it always felt like an attempt to use numbers as political rhetoric. It had been torn up by EPI and others before anyone found an attractive smoking gun like an Excel coding error.
If you wanted to have an honest debate on this issue, you'd just drop the regressions and make arguments based on clear observational patterns, e.g. that no one who had a very high debt to GDP ratio has ever grown much over the subsequent period. That's clearly not true, so you are stuck with an argument that will always be assumption-based.
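A toy version of the "tinker with a million different specifications" effect mentioned above, with everything invented: growth is pure noise unrelated to debt by construction, yet scanning enough thresholds and sample-trimming choices still turns up a "finding":

import random
import statistics

random.seed(0)
n_countries = 20
debt = [random.uniform(10, 150) for _ in range(n_countries)]
growth = [random.gauss(2, 2) for _ in range(n_countries)]   # unrelated to debt by construction

def corr(xs, ys):
    # Plain Pearson correlation with population moments.
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) * sx * sy)

best = 0.0
# "Specifications": every possible debt threshold and every choice of one country to drop.
for threshold in range(20, 140, 10):
    for dropped in range(n_countries):
        xs = [1 if d > threshold else 0 for i, d in enumerate(debt) if i != dropped]
        ys = [g for i, g in enumerate(growth) if i != dropped]
        if len(set(xs)) > 1:                      # skip degenerate splits
            best = max(best, abs(corr(xs, ys)))

print(f"strongest correlation found in pure noise: {best:.2f}")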
Oh, wait. There are some very simple tables of figures at the bottom of the WSJ reply. Sorry, I'm slow tonight, but the emphasis seems to be on the fact that R&R's critics have analyzed just 1945 to 2009, while R&R have analyzed figures since 1800.
Huh, well, I really don't know what to say to that.
57: saying that it's not clearly benign is very different from saying that it's clearly harmful. Maybe it's not clearly anything because the macro data can't answer the question. The reverse causation problem, which I didn't mention, is also enormous -- why did you get into so much debt?
saying that it's not clearly benign is very different from saying that it's clearly harmful
Right.
58: "This Time Is Different" was a pretty good book. Don't think it included this 'finding'
I'll take this under advisement, by the way, and not roundly repudiate everything these people have done.
I do think they have a public responsibility to account for the political/policy ends their paper has occasioned.
The fraction of academics who will readily admit they were wrong about something seems to be pretty small.
Of course, some of us just avoid being wrong, so it never comes up.
49: that sounds nice. I don't think that would work for me, since most people in my lab don't understand what the hell I'm doing.
65: are killer robots really that complicated? Or are the people in your lab that stupid?
44
Peer review might not catch the Excel error if they didn't submit the raw spreadsheet, but would absolutely have called out assumptions 1 and 2.
I believe this is incorrect. The choices that are being objected to were not explicitly spelled out in the paper (at least that I could see while skimming). This is a high-profile paper that has been available on the web for years. This implies it has been subjected to much more scrutiny than a typical peer review. If this didn't pick up the problems, it is unlikely peer review would have.
49 sounds great, but I wonder if there are even enough statisticians to do the first calculation.
66: it's not really a killer robot kind of a lab. I'm off in my own weird world.
Peer review is not terrible at rejecting outright crackpot papers, but it's kind of a leap of faith to expect it to do more than that. (Sometimes it does! But Sturgeon's law applies.)
68: I didn't have the impression they were doing anything that involved much statistical know-how in the first place?
It's hard for me to envision the peer review process in disciplines that don't really do experiments, like this paper. They ... claim they did these statistical analyses. How can we prove them wrong? And if we aren't privy to the original data, we can't tell if other analyses would have been better, right? What's the role of peer review? Just to make sure the "conclusions" aren't complete non sequiturs?
Sorry Parsimon but I think several other people are more qualified to judge econometrics than me.
it's not really a killer robot kind of a lab
To describe myself as disappointed by this hardly does justice to the depths of my sorrow.
71: Yes, it looks like they weren't. I was thinking more generally.
72: It is basically: assuming the authors are describing things correctly, did they do it correctly? Nobody replicates the analysis or audits the data.
Once the robot kills, you can't replicate because of IRB issues.
Also, I accidentally just listened to Planet Money (I was driving, the radio was tuned to NPR, and I didn't turn on Magnetic Fields before a story hook caught my attention), and R&R came in for some pretty harsh treatment. The takeaway from the story was, predictably, "We can't possibly understand this sort of thing, but these guys made a really dumb mistake -- or worse -- and they poisoned the conversation about the economy nationally and internationally. And now back to our non-coverage of Boston." At which point, I turned on Magnetic Fields, because I've tuned out the media for the past few days.
The reverse causation problem, which I didn't mention, is also enormous -- why did you get into so much debt?
Yglesias, who as you might expect has been all over this story, has mostly focused on this point. Even if the correlations they found hold up (which they apparently don't), that still doesn't say anything about causation, which might just as well go the other way, and in any case none of this supports any of the policy recommendations that people have apparently been drawing from the study.
Once the robot kills, you can't replicate because of IRB issues.
Fittingly, once the robot replicates you can't kill because of IRB issues.
If the killer robot is sent to the IRB on your behalf as proxy, does that make it better or worse?
Because this sort of thing is quite normal (see 8 10 15). You stop looking for mistakes when you get the results you expect.
This is a highly-visible highly-controversial topic. Even in the "real world" a good scientist has to anticipate potential objections. You don't need to be a Harvard professor to be able to raise objections like "what would happen if you didn't exclude those years" and "what would happen if you weighted by year instead of by country."
This is a high-profile paper that has been available on the web for years. This implies it has been subjected to much more scrutiny than a typical peer review. If this didn't pick up the problems, it is unlikely peer review would have.
The paper has been available for years, but the raw data and the assumptions behind the analysis have not been available until very recently -- no one has been able to really replicate their results until now. A good peer reviewer would have criticized the authors for not addressing robustness explicitly. The American Economic Review also has a policy of only publishing papers where the data are readily available to any researcher who wants to replicate the work.
In their WSJ response to the criticism, the authors seem to be saying that the results are basically the same regardless.
"It is hard to see how one can interpret these tables and individual country results as showing that public debt overhang over 90% is clearly benign."
They are just shifting the goalposts -- the policy argument is not over whether large public debt overhang is "clearly benign," but whether its costs are worse than the effects of austerity.
Basically R&R traded on their academic reputations to artificially lend respectability to austerity policies. Macroeconomics turns out to be much more politics than science.
If the killer robot is sent to the IRB on your behalf as proxy, does that make it better or worse?
If it kills the IRB that certainly streamlines the process of setting up future experiments.
This is a high-profile paper that has been available on the web for years. This implies it has been subjected to much more scrutiny than a typical peer review. If this didn't pick up the problems, it is unlikely peer review would have.
Oh yes. Because a bunch of ill-informed commentary in my facebook feed is so much more valuable than a careful review of the data by some few who actually have a sense of the methodology.
Likewise, the fact that the anti-vaccination activists are all over the new social media with their anti-vax "theories" must mean that medical science is wrong about the benefits of vaccination. I mean, where's the value of peer review in medicine and epidemiology when the anti-vax movement has such a strong presence on the internets, after all?
81
The paper has been available for years, but the raw data and the assumptions behind the analysis have not been available until very recently -- no one has been able to really replicate their results until now. ...
The raw data and assumptions normally aren't available to reviewers either. And reviewers aren't expected to replicate the results.
... A good peer reviewer would have criticized the authors for not addressing robustness explicitly. ...
Not all peer reviewers are good. To say the least.
... The American Economic Review also has a policy of only publishing papers where the data are readily available to any researcher who wants to replicate the work.
This is nice but not typical. And it is open to interpretation. For example, what exactly does "any researcher" mean? Do you have to have some unspecified level of academic credentials?
84
Oh yes. Because a bunch of ill-informed commentary in my facebook feed is so much more valuable than a careful review of the data by some few who actually have a sense of the methodology.
Making the paper available to all does not preclude careful review by experts.
84: Ill-informed commentary on Facebook and peer review for a journal are not the only possibilities. There's also expert but informal peer review, outside of the publication process, which is more likely to happen if people post more details of their work online. Anecdotally, at least, in my field people are much more likely to get useful commentary via email (or even Facebook!) after posting a paper to the arxiv than from a peer reviewer selected by a journal, especially if their work is of much interest. (Peer review will force someone to read the boring papers no one will look at on the arxiv, but who cares?)
18 -- & my guess for NZ's GDP growth rate in '51 being a bit dodgy is the '51 waterfront lockout...
Not all peer reviewers are good. To say the least.
Well, it remains the case that this paper was not subject to peer review. Perhaps peer review would have failed to prevent an identical paper from being published, but that's hardly a logical necessity. Robustness is an entry-level issue; when you only have a few data points, it's basically the whole game.
81: The top economics journals -- of which AER is one -- all require you to submit your code and data, so it's not atypical for economics. The only reason R&R appeared in its current form is that it appeared in this AER P&P issue, which is the exception to the rule.
||
Hopefully the last word on the late Lady Thatcher: unmissable scenes from around the country of people mourning on the day of her funeral. And yes, they show it on big screens in provincial cities.
|>
Oh man. My immediate reaction to this story was a feeling of terror in the pit of my stomach. I'm sure they're prevaricating asswipes but the idea of accidentally doing something like this in my research is totally haunting.
It happens. (bragging) My happiest academic moment as an undergraduate was reading a (widely cited) epidemiology paper by my department head (R/oy M. A/nderson), not following the argument, reading it again, making a large amount of tea, drinking it, reconstructing the paper's entire argument step by step, and finally working out that the reason the paper's conclusion didn't make any sense was that he had accidentally got an equation the wrong way up halfway through, and that was why he was asserting X proportional to Y when it should have been proportional to one over Y.
And, ajay? (And your career has flourished ever since...)
Ajay is now banned from graduate schools worldwide.
Ajay is now banned from graduate schools worldwide.
Ironically I subsequently came very close to being banned from my undergraduate institution for life, but for completely unrelated reasons.
Unexpected Flashman type behavior from the hitherto respectable ajay!
Was it for showing up in chapel four sheets to the wind or did you shove some underclassman's arse in the Common Room fireplace?
Unexpected Flashman type behavior from the hitherto respectable ajay!
My God, you've been saving that line up for the last eighteen months, haven't you?
"Expelled for drunkenness? No! Well, damme! Who'd have believed they would kick you out for that? They'll be expellin' for rape next. Know what they expelled me for? Mutiny! That's right, sir! Led the whole school in revolt!"
91: The link is phenomenally mean-spirited, but that's not the only reason I enjoyed it.
If this didn't pick up the problems, it is unlikely peer review would have.
It should be noted that the bulk of the problems were, in fact, identified on the Internet. The procedural issues weren't specifically identified because the procedure hadn't been disclosed, but the paper has long been understood to have been contrary to the available facts.
You might as well say that given the high-profile nature of the build-up to the Iraq War, you'd think that careful observers would have revealed on the Internet that the WMD claims were bogus.
And did you see how a bipartisan panel recently discovered that Americans tortured prisoners? I was shocked, given the fact that this information hadn't been widely disseminated on the Internet.
There's a huge disconnect between respectable opinion and actual reality, but reality is still out there, and every now and then it's going to bite people like R&R in the ass.
"Expelled for drunkenness? No! Well, damme! Who'd have believed they would kick you out for that? They'll be expellin' for rape next. Know what they expelled me for? Mutiny! That's right, sir! Led the whole school in revolt!"
Awwwwwwww yeaaaaaaaahhhhhhh. [Flashman-themed Super Bowl beer commercial of my dreams.]
OT: Nobody panic, but I think TWYRCL may be on to my best all-purpose most sincere excuses explanations: i.e., "Mistakes were made" and "Every war has casualties."
90
The top economics journals -- of which AER is one -- all require you to submit your code and data, so it's not atypical for economics. ...
Do these include "The Quarterly Journal of Economics" (which published Levitt's abortion paper)? I can't find any such policy stated on its website.
100
It should be noted that the bulk of the problems were, in fact, identified on the Internet. The procedural issues weren't specifically identified because the procedure hadn't been disclosed, but the paper has long been understood to have been contrary to the available facts.
You think wrong papers never make it through peer review?
re: 103
Given the nature of scientific inquiry, it'd be more the case that to a reasonable approximation, _all_ papers that make it through peer review are wrong.
This is a different kind of wrongness, though.
104
This is a different kind of wrongness, though.
Computer coding errors are unfortunately quite common. For example, at least two of Levitt's papers contain them. See here.
91: It's really only hit me now that the worst part of this terrible tragedy is that she won't die again.
102: QJE is a top economics journal, but I don't see anything on their website either. I'm surprised. Econometrica, AER, and Journal of Political Economy all require you to submit your data, so for the top journals, QJE is the exception.
Was the data for the Levitt abortion paper hard to get? IIRC, the paper that showed that Donahue and Levitt fucked up thanked them for the data, though who knows what that means in reality.
A nice follow-up from a guest-poster at Rortybomb.
As is evident, current period debt-to-GDP is a pretty poor predictor of future GDP growth at debt-to-GDP ratios of 30 or greater--the range where one might expect to find a tipping point dynamic. But it does a great job predicting past growth.
Flashman-themed Super Bowl beer commercial of my dreams.
See the Rik Mayall ads for Bombardier beer, which are very definitely Flashman-themed.
108 is why the article would have never passed peer review. The paper is just a bunch of summary statistics, with no statistical analysis, and no effort to tease out causality.
72: You can look for really obvious errors like:
"Your data is a time series. What did you do to account for serial correlation?"
"It seems like your predictor is correlated with a bunch of other confounding factors. Did you ever consider controlling for them?"
"You didn't cite my five plausibly related publications. REJECTED."
& my guess for NZ's GDP growth rate in '51 being a bit dodgy is the '51 waterfront lockout
which was obviously caused by the debt going over 90% of GDP. Duh.
So, question for the academic/research types. This sort of thing comes up fairly often -- an influential paper turns out to be a mess, and the response is some version of "There wasn't any way to figure it out without independently duplicating the experiment/analysis/whatever, and no one does that."
Presumably, students planning on research careers get some kind of formal instruction in research methods. Wouldn't it be useful to educate (while exploiting) undergrads/new grad students, by handing them recent interesting papers and telling them to go replicate the results? If they do, they've learned how to do whatever it is. If they can't, whoever's supervising them can look over their shoulders and see if there's something wrong with the initial paper, and then the supervising faculty person could go do something about that.
I figure this is either obviously a bad idea for some reason, or people already do it. Any idea which?
Dylan Matthews at the Wonkblog has an interesting summary of the fallout from the R&R topic.
Wouldn't it be useful to educate (while exploiting) undergrads/new grad students, by handing them recent interesting papers and telling them to go replicate the results?
The UMass work uncovering the Reinhart-Rogoff snafu was done by grad students.
114.3: people have talked about doing it, and there are some labs in fields semi-adjacent to mine where people actually do it, but replication is in a lot of cases, I think, a fairly involved process, especially for experiments where the methods require some level of expertise, and then there aren't publishing outfits for replications, successful or failed. That latter problem is a bigger one, really, and people talk fairly often about what to do about it.
It does happen some, though, for instance.
(Not speaking for my specific field (where replicability is often sort of a weird metric) or fields too far away from mine, by the way. Just the world I'm glancingly, but not hugely, familiar with.)
114: It might be good training, but it would involve a fair bit of work for people who know what they are doing (higher level grad students at least) and who will either want paid or have very strong incentives to do original work if they aren't getting paid.
If you want to pay someone to independently reproduce your work, you'll get an extra gold star on your paper.
Comment posted from the back row of a scientific conference.
who will either want paid
Yinz sure you meant to write it that way?
The sides of the cut open mice are inducing mild nausea.
If all the citations to a disproved paper were transferred to the record of the disprover, the incentives might cancel out. Then someone disproves the disproof...
Also, there aren't always duplicate samples, some fields get less data and want to hang on to it longer to get their publications out of it (not sure that's a good justification, but it's strongly incentivized), yada yada yada.
Mostly, and this is fleshing out Moby's comment, we try to reproduce methods because we want to use them in our own work, and by the time we can be sure it's the methods that are screwy we need to be doing something original.
There should be more long-term lab techs. That's steady soft money, though, and the PIs I know who have pulled that off for decades are rare.
118: That would be an issue. I was blithely thinking that there are probably enough students who are novices in the lab that you want them to be learning techniques somehow but you don't want them touching your nice clean important experiments.
122: the sides of them? Are you inside one?
What Moby said in 118: as far as I know, in the sciences, graduate-level work is supposed to be original work.
In the humanities, it's not quite that way: people often work at commentary on past work.
125: Probably varies by field, but as far as analyzing data goes, I think people junior enough to be expected to do work primarily for training require far more effort to supervise than it would take to just do the work yourself. I don't know from labs.
126: Now it's rabbits. I had rabbit for dinner last night.
Apparently, some people just cut open a rodent, take a picture, and drop it into powerpoint.
I thought you were on team rodent death, but I guess I'll just have to carry on the struggle alone.
For the record, here's a report on the approach of one of the principal authors -- a grad student -- to the matter.
No-one would replicate my data, for instance -- two years of crawling through a scrubby field? I have photographic evidence for some of it, but not all. For maybe a year, someone familiar with the ecosystem could see traces of the larger effects I'd mapped, but that's about it.
I wonder if they use a regular x-ray machine for mice or if they have a special, small machine.
"Most commercially available mouse x-ray CT scanners utilize a charge-coupled device (CCD) detector coupled via fibre optic taper to a phosphor screen. "
via A comparison of x-ray detectors for mouse CT imaging
Don't know that it's small. It could put a hundred mice through at a time in custom racks.
However; re OP; I am reminded of the 1840 census, which was terribly designed in a way that invited an error that added up to data showing that black people in free states tended to go insane. Even though this relied on measuring insane black people in towns that had no black people, even though there was a statistician explaining the problem, John C. Calhoun (and other unscrupulous cads) had got hold of their useful lie and never let it go. War, Reconstruction, Jim Crow, d.c.a.f. if it ever ends.
OT: For all his faults, Yglesias is spot on in his analysis of the economy of Westeros.
He probably gets it right only because Westeros doesn't have a public school system.
136: Thanks. I'm still going to pretend I don't know the answer if I see these people in the bar later.
Why do so many people think gold is innately useless? Have they not looked at their own crowns and cellphone charger contacts? Third-best STP conductor, inert, ductile: dayam.
When they say they "sacrificed" the mice, I hope they at least had a lab tech wearing robes and a stone altar.
Jesus christ, Portugal is now to undergo a new round of austerity cuts.
The country is grappling with its worst recession since the 1970s, and is bracing for a record 18.2-percent unemployment rate in 2013, up from 16.9 percent in 2012.
Reinhart and Rogoff have a lot to answer for.
I should just stop talking about this, so furious it makes me.
Regarding sharing code and data, here (via McArdle) is a paper which (on a brief look) suggests there is room for improvement:
While most of the top-ranked economics journals have recently introduced a data availability policy [2], the vast majority of journals either do not have a policy that requires authors to share their data or are reluctant to enforce it (McCullough, 2009; McCullough and Vinod, 2003). Also, Anderson et al. (2008) suggest that authors generally hesitate to share their data and code despite their pre-publication commitment to provide this information. This may suggest that editors, referees and readers are confident that the empirical results presented in the papers are always credible and robust. Unfortunately, this is not always the case. Dewald et al. (1986) tried to replicate 54 papers published in the Journal of Money, Credit and Banking and could only replicate two. Later, McCullough et al. (2006) tried to replicate 69 articles with archived data entries published in the same journal and could only replicate 14. Also, McCullough et al. (2008) tried to replicate 117 articles with archived entries published in the Federal Reserve Bank of St. Louis Review and could only replicate 9. These findings raise concerns regarding the credibility and reliability of empirical work [3].