That is interesting, if not exactly surprising.
Funnily enough, in my not-reversed-quasi-pseud-form I was at the A / rxiv a few weeks ago, talking to the person who wrote those filters. They (A /rxiv) are having a technical review, so a bunch of us came in from other institutions for a workshop.
The eldritch arxiv gods put out a big questionnaire recently asking users what they liked/didn't like and what they wanted changed. My response was mostly "don't change anything". They've done a remarkably good job of maintaining quality over the years given the sort of site they're running.
In the linked piece, I thought it was interesting that stylistic things like the distribution of stop words could identify crackpot submissions. I suppose it's not too surprising. Ever since the Usenet days, there's been a style - IMO usually involving huge walls of text with no structure or paragraph breaks - that's been characteristic of cranks.
Anyway, machine learning, philosophy of science, linguistics (sort of) and crackpots: it seemed like perfect unfogged fodder. Thanks for posting it!
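For anyone curious what "distribution of stop words" might look like in practice, here is a minimal sketch. It is not the arxiv's actual classifier, which the linked piece does not spell out here. The stop-word list, the function names, and the choice of total variation distance are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; a real filter would use a longer one).
STOP_WORDS = ("the", "of", "and", "a", "in", "to", "is", "that")


def stop_word_profile(text):
    """Return the relative frequency of each stop word among all stop words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in STOP_WORDS)
    total = sum(counts.values())
    if total == 0:
        # No stop words at all: return an all-zero profile.
        return {w: 0.0 for w in STOP_WORDS}
    return {w: counts[w] / total for w in STOP_WORDS}


def profile_distance(p, q):
    """Total variation distance between two stop-word profiles."""
    return 0.5 * sum(abs(p[w] - q[w]) for w in STOP_WORDS)
```

One hypothetical use would be to build a reference profile from papers already accepted in a given section and flag submissions whose distance from it is unusually large. Any threshold would have to be tuned on real data.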
Yes, I've seen a version of the output from that questionnaire. I expect I'm not at liberty to go into detail, either about the report or the technical plans for the future, but I doubt I'm violating any confidence -- since it's all positive -- if I say that both user satisfaction and uptime/reliability are and were very, very high for that type of project.
Man, it would be awesome if courts had an initial crackpot filter to screen insane pleadings out before anyone had to reply to them -- a "this is too crazy to bother the person you're purporting to sue with" stage. It's fundamentally impossible, because sometimes insane incompetents are right about having a real world grievance, in a way that doesn't apply to academic publishing, but it would be lovely.
It's a "very reliable" filter in the sense that the things they screen out are very likely crackpot papers, but plenty of crackpot papers still get through. It really is because of "familiarity with contemporary academic jargon": crazy people from outside will get filtered out, but crazy people employed as academics will generally not. Which I suppose is the desired goal, because people don't want to offend their colleagues, even the crazy ones.
That seems unnecessary. How hard is it for a crackpot to learn some jargon? You don't need Mr. 10,000 Hours mastery. Maybe get a few journals and do an hour of reading a day for a couple of months. Anybody who fits the whole "crackpot" concept should have that kind of time. We could have higher quality crackpots if we wanted to.
I like the crackpots who self-publish glossy books of their theories and send me copies in the mail.
I guess that counts as "higher quality" also.
I like the crackpots who self-publish glossy books of their theories and send me copies in the mail.
US TOO.
...there's been a style - IMO usually involving huge walls of text with no structure or paragraph breaks - that's been characteristic of cranks.
Must remind myself to put more paragraph breaks in my work emails.
More paragraph breaks, fewer uses of the word "tachyon". That's 75% of the battle right there.
I bet uses of the name "Einstein" outside of specific phrases (Einstein-Hilbert action, Einstein frame, Einstein-Maxwell theory) would work pretty well for filtering out the amateurs.
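A rough sketch of that heuristic, for the sake of argument: count mentions of "Einstein" that are not part of a whitelisted phrase. The whitelist below contains only the three phrases named in the comment, and the function name is made up for illustration. Nothing here is claimed to be what the arxiv does.

```python
import re

# Whitelist taken from the three phrases in the comment above (illustrative only).
ALLOWED_PHRASES = (
    "Einstein-Hilbert action",
    "Einstein frame",
    "Einstein-Maxwell theory",
)


def stray_einstein_count(text):
    """Count mentions of 'Einstein' that fall outside the allowed phrases."""
    scrubbed = text
    # Blank out the allowed phrases first so they are not counted.
    for phrase in ALLOWED_PHRASES:
        scrubbed = re.sub(re.escape(phrase), " ", scrubbed, flags=re.IGNORECASE)
    # Whatever 'Einstein' remains is a stray mention.
    return len(re.findall(r"\bEinstein\b", scrubbed))
```

A real whitelist would need to be much longer (field equations, Bose-Einstein condensates, and so on), or working physicists would get flagged too.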
7: I'm coming at this as a lawyer, but I think "inability to learn to parrot the jargon pretty well" is literally diagnostic of whatever the mental disorder is that makes you a crackpot (not that there aren't insiders who think crazy stuff as well, as essear notes, but it's sort of a different pathology). I see a lot of self-represented litigants. Some are crackpots whose papers are transparently lunatic, some are illiterates with a grievance which is sometimes legit and usually not, and some are bright people who aren't lawyers who do parrot the lingo and jargon pretty well, where the papers look like they were drafted by a lawyer until you see something jarringly off and realize that the writer is faking their way through. But the last category, who are faking it pretty well, are in my experience never substantively nuts -- they may be right or wrong about the merits of their claims (usually wrong), but you never see pretty good papers from someone who's claiming that some midlevel Tax Department official is engaging in a campaign of harassment by causing the plaintiff's employer to send him W-2 forms every January.
Doesn't the tax department cause everybody's employer to send a W-2?
Well, yes -- I was trying to describe someone whose grievance was self-evidently insane.
I suppose it's not intended to be harassment.
THAT'S HOW DEEP THE CONSPIRACY RUNS!
Even though it isn't supposed to be harassing, it sort of comes across as "...and here's one more thing for your to-do list."
Which reminds me that maybe I should check in with the IRS and see if I need to do anything else about the guy who filed taxes using my SSN.
Maybe that was me from the future because of the tachyon storm?
When I was an intern for a federal judge, the clerk assigning files to me used to smirk when they were obvious nutballs: "This could be the next Gideon v. Wainwright." We both knew it wasn't going to be the next Gideon v. Wainwright.
Gideon v. Wainwright sounds like a western.
And for some reason occupies the space in my brain where Merrick Garland should go. I have to keep on looking his name up to remember it isn't Gideon Wainwright.
Gideon looks sort of underpowered for either a guy who spent years in prison or the instigator of a major legal reform.
He does sort of look like an underfed Merrick Garland.
25: He does look a little like Henry Fonda, who played him in the TV movie, if I'm remembering correctly.
Henry Fonda, like Marlon Brando, demonstrates that all great actors come from Nebraska.
3: I'm also on team CHANGEBAD on my survey response. I'm not sure why they're even asking about comments given the trend of comments on the internet.
It really is because of "familiarity with contemporary academic jargon": crazy people from outside will get filtered out, but crazy people employed as academics will generally not. Which I suppose is the desired goal, because people don't want to offend their colleagues, even the crazy ones.
A lot of the crazy people employed as academics will also be filtered out, because they spend 20 years getting tenured in structural engineering or something and then embark on a crusade to teach the world about the meanings embedded in patterns of sunspots.
plenty of crackpot papers still get through
True. In the quantitative biology and statistical mechanics sections where I do most of my browsing, there's this one guy who, several times a year, posts a paper in which he claims to have used information theory to uncover the secret of life, the universe, and everything.
He's been doing this ever since I first started using the arxiv back in the 90s.
Then he must be tenured at it by now.
Interesting development in the comment section of the linked post. A certain famous theoretical physicist who won the Nobel prize at a young age and subsequently developed an interest in various forms of parapsychology and psychic phenomena showed up in the comment thread to complain about his own difficulties in getting papers accepted to the arxiv.
35 would appear to be an instance of the phenomenon described in 32, if AL's summary is correct. But from famous theoretical physicist's comment it's not clear whether his problem is restricted to his unconventional work or whether he also has difficulties with his more mainstream stuff.
We could have higher quality crackpots if we wanted to.
Crackpot Bootcamp!
36: He doesn't have any more mainstream stuff. He's been far out on the tail of craziness for decades now. If they're keeping Josephson from posting, that's a sign that their algorithm is working.
It's probably better to have won the Nobel prize and gone crazy than to have just started out crazy. Not that I have a basis for comparison.
'Better to have Nobeled and lost it than never to have Nobeled at all.'
Josephson? The guy from the junction? I am surprised that he's still alive. Wow, he's only 75!