It is not going to be very accurate. At best it could reach the accuracy of the average random stranger evaluating you.
I have a system that does the same thing, but I don't try to put a year number on it, because it would be too misleading and confusing.
I did a pic of me, scowling, wearing glasses and a hoodie, and it said I was 82. Then I did a pic of my 4yo smiling and wearing my glasses and it said he was 62. Then I did an intentionally crappy selfie of me lying down in bed and it said I was 35. Then I did a "nice" pic of me and it said 27. So I am now 27.
On 4 out of the 5 photos I tried, it placed me at 37. I'm 43, so I'm thinking it displays amazing acuity about personal appearance.
It consistently guesses xelA at 2, but thinks he's a girl.
I tried it on the picture of me with my law-school friend in the flickr pool. It made her 42 and me 64.
It makes me eight years younger than I am from my Twitter profile pic, and eight years older from my FB one. So spot on on average, but with a rather high margin of error.
It mostly gave me a feeling of "crap!" since I've been planning to put my version up for people to try for a while now, but Microsoft of all people stole that thunder.
I tried a recent picture, which it guessed about right at 36. Then I tried a scan of an old (low photo quality) picture from ten years ago, from which it guessed I was 57.
7: Well, you may have noticed that the MS version isn't very good. Results seem not a whole lot better than random, and inconsistent from one picture of the same person to the next.
(Mostly, I think the feature it needs is an ability to reject a picture as "I'm not going to make a good guess based on this picture.")
9.1: I think part of the reason for that, though, is that people aren't very good at this, either. The MS version makes weird errors that no human would ever make, and it doesn't use a lot of information that people would use, but if you asked a hundred random people across the world to guess your age from a hundred different photos, you'd get a lot of variance, and the mean might not be at all accurate.
It definitely makes some errors that are very human-like; it seems sort of racist, for instance (rates Asians as looking younger than Caucasians fairly consistently).
But it makes a lot of inhuman errors -- did you see Biohazard put up a picture of Domineditrix as a beautiful 17-year-old, which it called as 51? That's not a human mistake. (Neither is calling me 64 in the picture I linked. A person might call me anywhere between 30 and 50-something, depending, but mid-sixties isn't a mistake a person could make.) When it's making obviously inhuman errors all the time, I don't think it makes sense to credit the errors that do fall within human range as being made for the reasons a human would make them, as opposed to the system just being presently incapable of the task it's trying to do.
So is it going off of proportions? Or skin quality?
If you have a learning mechanism, where you get feedback from users and refine your search algorithm automatically, does the programmer have to accurately predict the number of "bins" that people clump around (maybe races in this case) or can learning mechanisms creatively discover multiple reasons that they're wrong? It seems like the program could get stuck in a misleading loop where one bin is always calling a calculation an error, and then another bin is always calling the same calculation spot on, and so the game forever corrects itself in a circle.
(I'm just musing - I'm sure it varies dramatically based on context.)
It's going off complex features that aren't going to be easy to talk about in words. I'm sure that things like lines on the face end up being important, but generally the way these things work is that you feed it a lot (many thousands) of labeled training samples and let it figure out what's important for itself. How you then figure out what features it cares about is an ongoing topic of research (by me, among other people). The "misleading loop" that you mention I don't know about, but a big problem with this kind of learning algorithm is overfitting, where the training samples with the same label have something in common that isn't actually important in the larger world. So for a simplified example, say that in every training image where the person was 72, they were also wearing Blue Blockers. The model would then become very convinced that anybody wearing Blue Blockers was 72, even though that's not quite a meaningful characteristic of age in the way you would want to think about it.
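If you want to see that failure mode concretely, here's a toy sketch in Python. Everything in it is invented for illustration -- a made-up "wrinkle score" and a Blue Blockers flag, not anything from a real face model:

```python
# Toy version of the Blue Blockers problem: in training, the sunglasses
# flag is perfectly correlated with the label; in the wild, it isn't.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Label: 1 = "is 72 years old", 0 = otherwise.
y_train = rng.integers(0, 2, n)

# Feature 0: a noisy real signal (the made-up wrinkle score).
wrinkles = y_train + rng.normal(0, 1.5, n)
# Feature 1: Blue Blockers, spuriously perfect in the training set.
blue_blockers = y_train.astype(float)
X_train = np.column_stack([wrinkles, blue_blockers])

model = LogisticRegression().fit(X_train, y_train)

# Out in the world, sunglasses have nothing to do with age.
y_test = rng.integers(0, 2, n)
X_test = np.column_stack([y_test + rng.normal(0, 1.5, n),
                          rng.integers(0, 2, n).astype(float)])

print("train accuracy:", model.score(X_train, y_train))  # near 1.0
print("test accuracy: ", model.score(X_test, y_test))    # much worse
print("learned weights:", model.coef_)  # leans hard on the sunglasses
```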
Is there any way, for that kind of machine learning, to inspect the software's 'thought processes' and eliminate false trails?
Like, if you were teaching a person to guess ages, you could ask them "What were you thinking" after an error. And if you got an answer like "That guy had to have been 72, he was wearing Blue Blockers, all and only 72-year-olds wear Blue Blockers," you would either tell the person, "wrong, ignore the sunglasses" or you would show them a bunch of non-72-year-olds wearing Blue Blockers and 72-year-olds without, and let them figure it out for themselves.
Is something similar very hard to do for machine learning, or do people not do it because the problem is to develop software that can learn without human intervention, or do people do it but it doesn't always work?
It was shockingly accurate for me and my wife. A two-year-old pic of me, exactly right. Current pic, exactly right. My wife, who is often mistaken for much younger, right for her next birthday in two months. Kids not so much, though.
||
After hearing two stories in the last week about students who got concussions from randomly tripping and falling, I just tripped on the subway stairs and banged my forehead. Somehow my glasses didn't break and I feel mostly okay except for a bit of bleeding from my nose and a big lump on my forehead, but I guess I should go somewhere to get it checked out?
|>
But then, I can also often buy clothes right off the rack with no alterations needed. The world revolves around me.
Added 14 years to my age. Obviously worthless.
I was just about to say, so you're all sending your selfies to Microsoft?
Dude, essear, that sounds painful. You know, it's overwhelmingly likely that it's nothing, but some small number of people take a small bump on the head, get a bleed, and DIE. So yeah, even though you're fine, it's worth getting checked out, preferably by someone who takes brain injuries seriously. Also, if you think you have a concussion, get off the internet and get some rest.
I put in a picture that was taken when I was 36. It said 38. I'm 38 now, so basically that thing has uncanny accuracy and knows more about me than it is letting on.
15:
I'm not an expert on this (I've worked in ML, but mostly on the systems side, trying to make things go faster, not trying to improve the learning algorithms), but I've worked with some of the best-known people in the field and I've asked them questions like these.
With that caveat, I think the problem is that instead of getting an answer like "That guy had to have been 72, he was wearing Blue Blockers, all and only 72-year-olds wear Blue Blockers", you produce (for example, if you use deep learning) a system of (say) 11 matrices, and when you do the appropriate matrix multiplies against the input image, you get some kind of output, so the equivalent statement is really incomprehensible to humans. You can tell when a system is generally overfitting (as in the Blue Blocker example), when it's underfitting (doing the opposite, basically), or making other systemic errors, but if you try to stare at a single result you get an answer like:
"That guy had to have been 72, the output of the first stage is a 224x224x3 matrix with the value X, the output of the second stage is a 55x55x48 matrix with value Y, etc."
And that kind of thing is very hard to interpret. If you do something that's very simple and can be "learned" with something simple (e.g., only 2 layers of matrix multiplies), you can usually look at each matrix and come up with an interpretation that's at least vaguely understandable by humans, but for state-of-the-art stuff on the hardest problems we're able to solve effectively, it's hopeless.
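To make that concrete, here's a toy sketch: shapes shrunk way down from anything real, weights random rather than trained, everything made up for illustration. This is what "inspecting the thought process" gives you:

```python
# Toy two-layer "network": just matrix multiplies on a flattened image.
import numpy as np

rng = np.random.default_rng(0)

image = rng.random(32 * 32 * 3)              # a flattened 32x32 RGB image
W1 = rng.standard_normal((32 * 32 * 3, 64))  # first "learned" matrix
W2 = rng.standard_normal((64, 100))          # second; say, 100 possible ages

hidden = np.maximum(image @ W1, 0)           # ReLU: 64 opaque numbers
scores = hidden @ W2                         # 100 opaque numbers
print("predicted age:", scores.argmax())

# The full "explanation" of that prediction is just vectors like this,
# with no human-readable meaning attached:
print(hidden[:8])
```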
On human vs. inhuman errors, you can see this in dramatic fashion if you know the specific algorithm (that is, the exact series of matrix multiplies, not the general learning algorithm), and try to fool it. See, for example, https://www.youtube.com/watch?v=M2IebCN9Ht4.
However, on images that are "similar" to images that the system was trained on, results are often very good -- for example, Google Street View's digit recognition has a lower error rate than humans on actual images from Street View, and many other systems are at around the same level for "realistic" inputs. I don't claim that this system is that good (it sounds like it isn't), but a system that is that good will still produce errors that seem absurd to humans.
or you would show them a bunch of non-72-year-olds wearing Blue Blockers and 72-year-olds without, and let them figure it out for themselves
This is pretty much what you do -- you develop a training set which doesn't have any spurious correlations -- but it turns out to be extremely difficult to do well. What you really want is a training set that contains all the possible variation in natural images for the relevant class labels with no spurious correlations. But because the features that the model uses are discovered on the fly, you don't really have an a priori way of knowing what those spurious correlations are going to be, and in fact it turns out to be quite difficult to figure them out even later on, because you can't easily directly query the model to figure out what it cares about.
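There are partial tricks for probing a trained model after the fact. One simple one is occlusion sensitivity: blank out patches of the input and watch how much the prediction moves. A rough sketch, where `occlusion_map` and `DummyModel` are names I'm making up and the dummy stands in for whatever classifier you actually have:

```python
import numpy as np

def occlusion_map(model, image, patch=16):
    """Slide a gray square over the image; big prediction changes mean
    the model cared about that patch."""
    h, w, _ = image.shape
    base = model.predict(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.5  # gray square
            heat[i // patch, j // patch] = abs(model.predict(occluded) - base)
    return heat

class DummyModel:  # stand-in for a real age model
    def predict(self, img):
        return float(img.mean())  # pretend "age" is mean brightness

heat = occlusion_map(DummyModel(), np.random.default_rng(0).random((64, 64, 3)))
print(heat)
```

Even this only tells you *where* the model is looking, not *why*, which is part of why the querying problem stays hard.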
you can't easily directly query the model to figure out what it cares about.
That was exactly what I was wondering.
24.last: I thought Google Street View was using crowd-sourced CAPTCHAs for digit recognition.
If you want to play around directly with fooling these algorithms, there's a demo here where you can upload (or link to) an arbitrary image and cause a (state-of-the-art) classifier to categorize it as whatever arbitrary class of object you'd like, by adding what looks like random noise.
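The trick behind demos like that is usually some variant of gradient-based adversarial perturbation, e.g. the fast gradient sign method. A minimal sketch, with a torchvision model standing in for whatever the demo actually runs (one step often isn't enough; real demos iterate):

```python
# Fast-gradient-sign-style targeted attack: a sketch, not the demo's code.
import torch
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in image
target = torch.tensor([207])  # ImageNet class 207, "golden retriever"

loss = torch.nn.functional.cross_entropy(model(image), target)
loss.backward()

# Step against the gradient to make the target class more likely;
# epsilon is small enough that the change looks like faint noise.
epsilon = 0.03
adversarial = (image - epsilon * image.grad.sign()).clamp(0, 1).detach()

print("now classified as:", model(adversarial).argmax(dim=1).item())
# Repeat the step until the classifier reports whatever class you want.
```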
It actually seems to be a way to collect a vast amount of metadata from people's photo collections, without telling them that's what you're doing, for reasons so far unspecified.
27: Street View uses crowd-sourced CAPTCHAs on examples that it can't classify. They are essentially doing something like LB asked about: "tell me what you got wrong, and I'll find the correct answer for you, and then you can learn from that". But the CAPTCHA responses are aggregated over a large number of users, so compared to any one individual Street View is going to do better.
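The aggregation step is conceptually just majority voting over transcriptions. A tiny sketch (the numbers are invented):

```python
from collections import Counter

def aggregate(captcha_answers):
    """Pick the most common transcription across users."""
    return Counter(captcha_answers).most_common(1)[0][0]

# Five users squint at the same blurry house number:
print(aggregate(["1024", "1024", "1624", "1024", "1029"]))  # -> "1024"
```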
29: In general, if there is ever a site that offers you the opportunity to upload information so that a computational model can guess something about you from it, they want you to upload that information so that they can use it as training data to improve that model or some other model.
But because the features that the model uses are discovered on the fly, you don't really have an a priori way of knowing what those spurious correlations are going to be, and in fact it turns out to be quite difficult to figure them out even later on, because you can't easily directly query the model to figure out what it cares about.
cf. the probably spurious story about the US Army teaching a neural network to recognise tanks in aerial photos, and using a training set of 50 photos of tanks (taken on a sunny day) and 50 photos of non-tanks (taken on a cloudy day) and only realising months later that what they had actually produced was a neural network that could tell you if it was sunny or not.
In general, if there is ever a site that offers you the opportunity to upload information so that a computational model can guess something about you from it, they want you to upload that information so that they can use it as training data to improve that model or some other model.
But this doesn't seem to be the case here - the project is not about collecting photos at all. The photos aren't being collected (see the article) - just the metadata.
Also, the question of whether or not these algorithms actually do better than humans is a very difficult one to answer. There was a bunch of press recently claiming that various face recognition algorithms were now better than humans, but those results were on a database of 13,000 or so images that turns out to probably be "too easy", in terms of not having as much variation as a human observer would "realistically" encounter.
I really screwed up my Presidentiality, I just realized.
33: Right, but the metadata is itself useful.
22: I'm at a clinic now. Seem to just need an ice pack and maybe a tetanus shot just in case.
So this thing is pretty inaccurate, or all my sags and proportions change dramatically when I smile.
This is why it's so helpful for men to tell women to smile.
34: Yeah, the circumstances for a lot of these results are contrived, which is something I don't see talked about a lot on the internets (HN, reddit, etc.).
The results from last year's imagenet competition (http://image-net.org/challenges/LSVRC/2014/index) are really impressive! But of course in one challenge there are only 200 categories (8 bits) and in the other there are 1000 (10 bits). Humans can surely classify hundreds of thousands (maybe even millions) of things, but using current published algorithms it's too computationally intensive to train a system that can recognize 1M things (20 bits).
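(Those bit counts are just log2 of the number of classes:)

```python
import math
for classes in (200, 1000, 10**6):
    print(classes, "classes ~", round(math.log2(classes), 1), "bits")
# 200 ~ 7.6, 1000 ~ 10.0, 1000000 ~ 19.9
```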
MS has talked about using FPGA-accelerated deep learning, which should buy them something like 10x-100x in performance, but scaling this up isn't linear and I doubt that gets them to the point where they're in the same league as humans in terms of the number of different types of things that are recognizable. They have enough resources to build huge clusters of those things, but even that's only going to buy you, say, 100,000x in performance, which is probably not enough to get there without better algorithms.
Street view and many other applications that seem to do really well have the advantage of being reducible to a classification problem with relatively few possible outputs (plus figuring out what part of the image needs to be classified).
The classifier got me almost exactly correct, but it's easy to imagine that 30-something white male software engineers are overrepresented in their training data.
Is there any way, for that kind of machine learning, to inspect the software's 'thought processes' and eliminate false trails?
Have it play tic-tac-toe solitaire. Duh.
By the way, this is one of the things I find most baffling about the LessWrong / MIRI belief that some kind of super AI is in our near future. There's a job posting floating around that's originally from about ten years ago, when they wanted to hire "Seed AI" programmers who would somehow create a singularity-inducing AI.
And then over here in practitioner land, the speedup you get with each new generation of hardware has slowed down so much that large software companies like MS are hiring giant teams to build dedicated hardware to accelerate specific tasks (which is a one-time fixed gain, not an exponential gain like Moore's law). Even with those efforts, ML folks can't get systems folks to produce systems that are fast enough to allow them to train systems that produce "good" results on tasks that are actually quite narrow.
And then we have people proclaiming that, within ten years, we'll have some kind of superhuman AI that's better than humans at all tasks.
For some reason it's always within ten years.
Somewhere around my apartment I still have a copy of a book someone gave me (an awful one) which cheerfully asserted at the beginning that we'll have solved the problem of human consciousness and replicated it in computers within ten years or so. I think it came out in 2004 or so. My first thought when I read that was "I'll have to send this guy an email asking how that's going in ten years or so" but I haven't gotten around to it yet. I probably should though.
I think it was Radiant Cool: A Novel Theory of Consciousness, which is a "novel of ideas" and is exactly as well written and insightful as the title suggests. But I can't track down the relevant quote so I may be misremembering where it was from.
Speaking of algorithms, our second grader appears to have invented a truly novel math problem-solving act. He and his brother were working on some multidigit division problems for a math-a-thon fundraiser, e.g. 4957.2 / 91.8. They don't do the standard long division algorithm yet because it's too rote; they try to do it in a number-sense, common-core-approved way first (thanks, Obama!). So they're encouraged to first eliminate the decimal: 49572 / 918. Then for the divisor they're supposed to estimate how many times it might go into the dividend by looking at place value and the first digit, guessing easy factors, e.g. 50 or 10 or 5 or 2. You multiply the divisor times your guess, subtract from the dividend, get a new remainder, and repeat with another guess on the smaller number; then when there's no remainder add up all your guesses.
Second grader said F that, estimating your guesses is too inefficient. So he started with the divisor and made a doubling table: 918, 918 * 2, 918 * 4, 918 * 8, etc. Then he subtracted the largest power of 2 times the divisor that still fit in the dividend. Then repeated on the remainder using the next appropriate power of 2. Then added the powers of 2 together -- basically solving for the quotient expressed as a binary number. It was way faster.
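In code, his method looks something like this (a sketch; the function name and structure are mine, not his):

```python
def doubling_division(dividend, divisor):
    """Binary long division via a doubling table."""
    # Build (power_of_two, power_of_two * divisor) pairs up to the dividend.
    table = []
    p, m = 1, divisor
    while m <= dividend:
        table.append((p, m))
        p, m = p * 2, m * 2
    # Subtract the biggest entries that fit; the quotient falls out as a
    # sum of powers of two.
    quotient, remainder = 0, dividend
    for p, m in reversed(table):
        if m <= remainder:
            quotient += p
            remainder -= m
    return quotient, remainder

print(doubling_division(49572, 918))  # -> (54, 0), i.e. 49572 / 918 = 54
```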
That seems to me like a method that a computer might use for factoring, given the binary nature of it. Anyone familiar with cryptography know if that's similar to how you factor large numbers?
Or maybe that belongs in the funny things my kid said discussion.
Excuse to cancel multiple meetings with time-consuming visitors who are all leaving town tomorrow, updated whooping cough shot for protection from crazy anti-vaxxers, opportunity to spend the afternoon at home resting with an ice pack: I should hit my head more often!
That sounds reminiscent (although solving a different problem) of the technique known as Ancient Egyptian multiplication, among other ethnic names.
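For comparison, the multiplication form of the same doubling trick, as a quick sketch:

```python
def egyptian_multiply(a, b):
    """Ancient Egyptian (a.k.a. peasant) multiplication."""
    total = 0
    while b > 0:
        if b % 2 == 1:        # this power of two is in b's binary expansion
            total += a
        a, b = a * 2, b // 2  # double one factor, halve the other
    return total

print(egyptian_multiply(918, 54))  # -> 49572
```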
47: A placebo blow to the head would probably work just as well.
It feels so good when you stop and get excused from work.
Somebody just cited a paper I was on.
This reminds me of IBM Watson doubting that humans are placental mammals and placental mammals are animals.
https://news.ycombinator.com/item?id=9006414
Or, for that matter, the 1980s academic medics who managed to create a racist computer:
http://www.theguardian.com/news/datablog/2013/aug/14/problem-with-algorithms-magnifying-misbehaviour
It's not that hard to create a racist computer, it turns out. I've done it myself.
It feels so good when you stop and get excused from work.
I have a lifelong habit of fantasizing about mild injuries and illnesses that excuse me from life, for a while.
Then I get stuck arguing with myself on whether the cons of being sick or injured outweigh the pleasure of vacation in bed. I tend to get very detailed about it: "Would my back hurt, lying in bed that long? What if I wanted to get stuff done around the house? Would I be able to? Would I get out of shape?"
It thought a picture of me at Veselka after a shvitz two years ago was 23. I am not trying any other pictures.
If we were in a certain kind of movie, essear would now no longer be good at teh skience but would be a brilliant poet or something. I might be currently making up the poems he would write collaboratively with another commenter elsewhere, but SOOBC forbids I say more.
45: If I'm understanding that correctly, that's a great trick! It's the kind of thing people sometimes do to speed up algorithms. However, my understanding is that the state of the art for factoring large numbers relies on more sophisticated math, and the algorithms often rely on randomization. I've only skimmed this page but AFAICT there are multiple algorithms they link to that I don't have the math background for, and I was a math major.
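One of the simplest of the randomized ones is Pollard's rho. A sketch, nowhere near the state of the art (that's the number-field-sieve territory that page gets into), but it shows the randomized flavor:

```python
import math
import random

def pollard_rho(n):
    """Return a nontrivial factor of composite n (retries on failure)."""
    if n % 2 == 0:
        return 2
    while True:
        c = random.randrange(1, n)          # random polynomial constant
        f = lambda x: (x * x + c) % n
        x = y = random.randrange(2, n)
        d = 1
        while d == 1:
            x = f(x)                        # tortoise
            y = f(f(y))                     # hare
            d = math.gcd(abs(x - y), n)
        if d != n:                          # proper factor found
            return d

n = 8051  # = 83 * 97
print(pollard_rho(n), "divides", n)
```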
And the scientists at Puppy School would discover that you can do even BETTER science with the brain of a poet, because godpoetryuniversemusic* or something.
* not to be confused with bloodsugarsexmagic
||
Live blogging! We have doner and Turkish delight.
|>
||The fake, paid for B/yron A/llen "AT&T, stop being racist" protesters have now moved to another law building (thankfully far from my window) but have totally stepped up the game for their fake protest. They now have a big sign up that says "Honk Against Racism" and are getting lots of honks from drivers who are presumably against racism|>
62: There's a place to get döner up there?!?!
Live blogging! We have doner and Turkish delight.
Sounds tasty.
Döner?! I've hardly MET her!
I apologize. You don't get to make that weary joke in perfect tenses much.
67: Bonus points for the double sausage joke, too!
Thanks to clew for showing me around and almost killing me with the stairs.
I feel obligated to note that she used the traditional greeting, which I've not heard from others on a first meeting where you weren't sure you had the right person.
Young men in black clothes and masks.
"Do you have 'dick' tattooed on your arm?"
(A commenter approached Jammies with that, at an Austin meet-up long ago. I know exactly who it was and remember all sorts of things about him, but I'm blocking at his pseud.)
||
In case anyone was wondering, I still hate all the doctors.
|>
Is döner different from shawarma? I've never been clear.
This was chicken, so yes. It's closer to a gyro.
We forgot, or possibly were too wise, to publicly announce where we were meeting. So no stalker lurkers. Probably just as well.
After going up the stairs, Moby explained that his conference is about how a lot of knee surgery doesn't work after all. Oops.
72: If it helps, I'll shout that out here. I am surrounded by hundreds of them.
72: Sympathy. Hopefully whatever they're screwing up can be remedied.
At, what now, E.?
77: do doctors normally like each other?
76.2: Buck could have told you that.
I just gave in, and tried a picture of myself that I happened to have on my computer at work.
I got: "Couldn't detect any faces."
There was a Billy Idol song about that.
Just general incompetence/disregard for my time and energy. I spent 5 hours waiting around to be told "oh, sorry, no, still don't know what's causing this, good luck in Montana!"
5 hours! Think of all the television I could have viewed during that time!
Speaking of meetups, I'm taking my daughters to Seattle May 17th for the Red Sox–Mariners game, and I'm thinking we'll stay overnight. Anyone up there want to join us for the game (starts at 1:00) or dinner afterwards?
It thinks I'm 36, my sister in law is 34, and my right boob is 30.
Your boobs are only six years younger than your face?
How are the photos for 92 not in the Flickr pool?
Different photo, my teeth are 50. Another one, my chin is 30 and my leg is 34.
(Tries more) Ok, it now can't find any faces and keeps selecting random bits of background.
My boob is fully clad in that pic. The dress has a pattern though.
18: 16 years younger, anti-wrinkle efforts for the win! (but mainly genetics.)
62: my name is alameida and I approve of these menu choices.