Ever wondered how some people know the mystery pic answer? Well, some are born with supernatural powers that allows them to realize which image is it just by the colors in it, but unfortunately I was not born with them so as soon as they released mystery pic again after months of absence, I started to wonder if it was possible to make a program that solves it. Of course a program can solve it, but wanted to have a solution that took less than a minute to solve it, so that's the hard part .
Even if it sounds hard, DON'T PANIC. If you read enough and try hard, all is possible. So, after some weeks of programming and testing a looot of possible solutions, I'm finally happy with the results and wanted to share them to the community.
I did this with python, with the help of several libraries:
- PIL (pillow) -> this handles image reading convertion to rgb
- urllib -> to download images
- ThreadPool -> to make some things faster with pools
- numpy -> this does a lot of array handling in c++ to make it faster
1) Downloading an item image from the internet:
def doWork(img): # download image and save on images folder with image name urllib.urlretrieve("http://images.neopets.com/items/"+img, "../images/"+img) def downloadImages(arr): # pass array of item image names to download print "downloding " + str(len(arr)) + " images" p = ThreadPool(processes=50) # makes it faster p.map_async(doWork, arr).get(999999999)
2) Reading an image and converting to rgb:
def ImagetoRgb(name): # pass name of the image inside our images folder, open and return rgb conversion return Image.open("../images/"+name).convert('RGB')
3) Figuring out mystery pic "real" dimensions: (even if its a 150x150 px image, we actually only care about 5x5). The idea is that we check every row and every column, to check in how many pixels it changes color. When it changes, we save this count if it is lower than our max. So even if the image has repetead pixels, as we only pick the lowest distance from one color to another, we will actually pick the right count: 150/5 = 30
def getImagetoSolve(link): im = Image.open(link).convert('RGB') #open our image imx = im.load() // load pixel data minCount = 9999 # we start with a big number of our lowest pixel distance for x in range(im.size[0]): # img.size[0] is the width difColors = [] # colors we have checked on this row difColors2 = [] # colors we have checked on this column r1=1 # our current distance in row is 1 r2=1 # our current distance in column is 1 for y in range(im.size[1]): # img.size[1] is the height col = imx[x,y] # this gives the pixel in the x,y coordinates col2 = imx[y,x] # this gives the pixel in the y,x coordinates if col not in difColors: # if its a pixel we haven't seen difColors.append(col) # we add it as seen if r1 > 1: # if its not the case when we actually start, this means we reached a new color minCount = min(minCount, r1) # we update our ditance else: r1+=1 # if not, we have traveled 1 pixel more without a new color # the same but with column if col2 not in difColors2: difColors2.append(col2) if r2 > 1: minCount = min(minCount, r2) else: r2+=1 # now we know the real distance between pixels, that is to say, every x pixels there is a new pixel color tileSize = im.size[0] / minCount # tileSize represents each pixel size in the 150x150 image (30 pixels if its 5x5) minCount = im.size[0] * 1.0 / tileSize # 150x150 in a 9x9 image is not a round number, so we fix the minCount to represent exact value pixelData = [] # real data we care about (5x5 or 9x9) print link, tileSize, minCount for y in range(tileSize): # here we add each pixel we care about to our grid pixelData.append([]) for x in range(tileSize): # we pick the pixel at coordinates x*mincount,y*mincount but that might give a value with decimals so we do a little math fix pixelData[y].append(imx[math.floor(x*minCount)+1,math.floor(y*minCount)+1]) arr1=numpy.array(pixelData) # we pass it to numpy array which is much faster #return useful stuff return { "img": im, "imx" : imx, "tileSize": tileSize, "pixelData": arr1 }
4) Comparing the new 5x5, 9x9 or 10x10 pixel image we now have with an image we want to check:
This can be done with a simple pixel x pixel match, but this would take hours to solve if you want to compare against 100k images and check for rotations as well. So after a bit of searching, there is an awesome concept called integral image, which is really nicely explained on wikipedia, but the idea is that we want to match a 5x5 pixel image window over a 80x80 (as example). The idea of integral image is that instead of having a grid with all the pixels, we pass that to a grid of sum of pixels, in a way that we can easily know the sum of pixels on a 5x5 window of the 80x80 image.
So instead of checking against every pixel, we actually only check if the sum of pixels of the 5x5 image is the same as the window we are looking on the 80x80 one. If it doesn't match, there is no need to check if all pixels match. If it matches, its possible that it is a perfect match, so we add that region as possible regions to analize later pixel by pixel (note that except extreme cases, if it gives a posible region it most likely will be the match we want)
And the awesome part is that the integral image of the 5x5 one is the same as its possible rotations, as we just sum up all the pixels values
def find_image(im, tpl): #check if matches, if it does it returns the location of the match. If not, throws an error . im is our 80x80 image, tpl is our 5x5 template im = numpy.atleast_3d(im) # we make sure it has at least a depth of 3 (red,green,blue) tpl = numpy.atleast_3d(tpl) # we make sure it has at least depth of 3 (red,green,blue) H, W, D = im.shape[:3] # we get the heigh, width and depth h, w, d = tpl.shape[:3] # Integral image and template sum per channel sat = im.cumsum(1).cumsum(0) # Calculate lookup table for all the possible windows iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:] lookup = iD - iB - iC + iA # Possible matches tplsum = numpy.array([tpl[:, :, i].sum() for i in range(D)]) possible_match = numpy.where(numpy.logical_and(*[lookup[..., i] == tplsum[i] for i in range(D)])) hastoCheck = False for possibility in possible_match: # if there isn't a possible match, we don't have to check anything else if len(possibility) > 0: hastoCheck = True if hastoCheck: # if there is a possible match, we have to check pixel by pixel against the 5x5 one, and its possible rotations zipped = zip(*possible_match) tp2 = numpy.rot90(tpl) tp3 = numpy.rot90(tp2) tp4 = numpy.rot90(tp3) for tp in [tpl,tp2,tp3,tp4]: # Find exact match for y, x in zipped: if numpy.all(im[y+1:y+h+1, x+1:x+w+1] == tp): return (y+1, x+1) raise Exception("Image not found")
Between this and a bit of threading, I was able to match a 5x5 image against 45723 item images I obtained in just 12 seconds using 100% cpu. Took more to read the images and start the threads than the actual matching which takes less than 0.02 seconds each. Using 50% cpu it took 20 seconds
So this awesome method of image matching is great for our purpose and quite useful for a lot of things in general when we want to match something exactly in an efficient way
However, all this image matching is not actually really necessary to match in less than a minute. Another choice is using a database and storing the images in a way we can filter the ones that do not even have the pixels that the 5x5 image has. Why check an image with no blue pixel if the 5x5 one has a blue pixel? That's the general idea, and I tried over 10 methods of doing probability filters and things like that but they all took a lot (30-60 seconds, which is more than manually matching) and besides were a pain to set up the data on the db.
Finally, in my last try I decided to go with a pretty straightforward and simple structure, just make a row on a images table with image name and all its unique pixels data. Then, I check the unique pixels of the 5x5 image and query the images that have all those pixels on it. With this, the program took 10 seconds and the actual query on mysql took 3-4 seconds.
The best part of this approach is that it will work better when you need to match against more than 40k images, and surprisingly enough, with the last 3 mystery pic competitions when I made the query on the db, only the picture that was the actual solution wasn't filtered out! This says a lot about the uniqueness of pixels, even if the 5x5 image had only 6 different pixels, from the 47k images I had only 1 had exatly the same pixels in them
All in all, my program is 400 lines, with both the db and integral image approach.
Feel free to ask any questions or give an idea for an ever better solving! I'm pretty sure this would solve any mystery pic in under a minute with all neo pictures though