What started out as a guest lecture for a Cognitive Science class became a sort of talk show about all sorts of exciting and fun things going on in AI these days, with tons of video clips (about 50). Here is my outline for the class. If I were forced to pick the three most interesting clips, I would choose:
- Brains, Meaning and Corpus Statistics
- Performance Capture in Avatar
- IBM’s Jeopardy playing computer Watson
Which videos strike your fancy?
Last Time: Connectionism
- Connectionism models the mind using networks of interconnected units.
- Units are organized into layers: network input, network output, hidden layers in the middle.
- An individual unit:
- can be active or inactive.
- has an activation threshold.
- has weighted connections to some units on its input side and some units on its output side.
A network:
- receives input when we activate or deactivate its input units.
- spreads activation through its layers by activating a unit if the sum of the unit’s weighted input activations exceeds the unit’s activation threshold.
- produces an activation pattern at its output layer which we read off and interpret.
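The three network steps above can be sketched in a few lines of Python. This is a minimal illustration, not a model from the lecture: the function names and the hand-picked weights and thresholds (chosen so the network computes exclusive-or) are assumptions made for the example.

```python
def step(unit_inputs, weights, threshold):
    """Return 1 (active) if the weighted input sum exceeds the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, unit_inputs))
    return 1 if total > threshold else 0


def run_layer(inputs, layer):
    """Activate every unit in a layer; each unit is a (weights, threshold) pair."""
    return [step(inputs, weights, threshold) for weights, threshold in layer]


def run_network(inputs, layers):
    """Spread activation from the input units through each layer in turn."""
    activations = inputs
    for layer in layers:
        activations = run_layer(activations, layer)
    return activations


# Hypothetical hand-picked weights, not learned ones.
hidden = [
    ([1, 1], 0.5),   # active if at least one input is active
    ([1, 1], 1.5),   # active only if both inputs are active
]
output = [([1, -1], 0.5)]  # active if the first hidden unit fires but not the second
xor_net = [hidden, output]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, run_network([a, b], xor_net))
```

Reading the output layer for each of the four inputs gives the exclusive-or pattern. A real connectionist model would learn its weights from examples rather than have them set by hand.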
Connectionism:
- is biologically inspired. (Units are a bit like neurons.)
- performs useful tasks. (We talked about word recognition last time. We’ll see more examples this time.)
- helps with theory. (Representation is robust, distributed: units can be removed without ruining the network. Similarly, neurons can die or be disabled without ruining the brain.)
Numenta: Hierarchical Temporal Memory
From our reading, we learn Jeff Hawkins likes brains. He:
- created the Redwood Center for Theoretical Neuroscience. (Biology)
- wrote On Intelligence. (Theory)
- founded Numenta. (Performance)
Biology of neocortex described in this keynote. (Start around 10:00.)
- 14:00 “… ‘cus they can’t do these experiments on humans.” – The experiments involve opening the skull and attaching sensors directly to the brain.
- 15:30 “The algorithm underlying each of these regions is the same.” – 2,000,000 such regions.
Performance mechanism described in this lecture. (Start around 11:50.)
- 19:15 “… easy to use tools.” – Trying out the Vision4 Demo is your homework.
- 20:10 “…give that a shot.” – Let’s go ahead and see a video demoing VitaminD.inc’s software.
Language
Chinese Handwriting Recognition
Peter Norvig
- is Director of Research at Google.
- describes machine learning by linear separators.
- explains how to make a spelling corrector.
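To give a feel for the spelling corrector idea, here is a rough sketch in the spirit of that approach: generate every string one edit away from the typed word, keep the ones that are known words, and pick the most frequent. The tiny corpus and the names `edits1` and `correct` are assumptions made for this example; a real corrector would count words from a large body of text.

```python
from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Return word if known, else the most frequent known word one edit away."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word  # nothing better to offer
    return max(candidates, key=counts.get)


# Toy corpus (made up for illustration).
CORPUS = "the brain learns the meaning of a word from the way the word is used"
COUNTS = Counter(CORPUS.split())

print(correct("teh", COUNTS))   # the
print(correct("wrod", COUNTS))  # word
```

The statistics do the work here: nothing in the code knows any spelling rules, only which words show up in the corpus and how often.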
Unsupervised Part-Of-Speech Tagger Explained
- Part-of-speech as in noun, verb, adjective.
- A part-of-speech tagger reads some text and decides the part of speech of each word from what it sees.
- Unsupervised means that you don’t label anything as noun, verb, etc. Instead the system develops its own categories based on how the words are used.
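Here is a toy sketch of how categories can emerge from usage alone. It groups words that appear in exactly the same (previous word, next word) contexts. This is a deliberate simplification and not the tagger from the video: real systems cluster words with similar contexts rather than identical ones, and the sentences and function names below are made up for the example.

```python
from collections import defaultdict


def context_signatures(sentences):
    """Map each word to the set of (previous word, next word) pairs it occurs in."""
    contexts = defaultdict(set)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
            contexts[word].add((prev, nxt))
    return contexts


def induce_categories(sentences):
    """Group words whose context sets are identical; no labels are ever given."""
    groups = defaultdict(list)
    for word, ctx in context_signatures(sentences).items():
        groups[frozenset(ctx)].append(word)
    return sorted(sorted(words) for words in groups.values())


SENTENCES = [
    "the cat sleeps",
    "the dog sleeps",
    "a cat eats",
    "a dog eats",
]

print(induce_categories(SENTENCES))
# [['a', 'the'], ['cat', 'dog'], ['eats', 'sleeps']]
```

The program never hears the words "noun" or "verb", yet the three groups it finds line up with determiners, nouns, and verbs.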
Machine Translation uses statistics gathered from millions of documents and trillions of words.
-
- called “Transcribe Audio” in the interface.
- used to improve search.
- can be translated.
- may not work well.
Dragon Naturally Speaking is the longstanding standard in machine dictation. See:
Classification
- Spam Filtering
- Netflix prize for recommending movies.
- Pandora Internet Radio provides recommendations too, but musical features (between 100 and 500 per song) are extracted by musicians rather than machines (takes 20-30 minutes per song).
- Shazam uses your phone to identify music (ad).
- Shazam works by comparing “fingerprints” of peak intensity over time. If two songs play the same loud notes at the same time, then they probably are the same.
- YouTube Content ID
- Google Prediction API lets you run your own labeled (supervised) data against their learning algorithms.
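The Shazam fingerprint idea in the list above can be sketched as follows. Each recording is reduced to a list of (time, frequency) peaks, and a clip matches a song when many of its peaks line up with the song's peaks at one consistent time offset. This is a toy version under stated assumptions: the peak data is invented, and the names `match_score` and `identify` are hypothetical, not part of any Shazam API.

```python
from collections import Counter


def match_score(song_peaks, clip_peaks):
    """Count the peaks that line up at the single best time offset."""
    offsets = Counter(
        song_time - clip_time
        for clip_time, clip_freq in clip_peaks
        for song_time, song_freq in song_peaks
        if clip_freq == song_freq
    )
    return max(offsets.values(), default=0)


def identify(clip_peaks, library):
    """Return the title in the library whose peaks best match the clip."""
    return max(library, key=lambda title: match_score(library[title], clip_peaks))


# Invented fingerprints: (time step, frequency in Hz of the loudest note).
LIBRARY = {
    "Song A": [(0, 440), (1, 660), (2, 550), (3, 440), (4, 880)],
    "Song B": [(0, 330), (1, 440), (2, 330), (3, 495), (4, 660)],
}

# A short clip that starts two time steps into Song A.
CLIP = [(0, 550), (1, 440), (2, 880)]

print(identify(CLIP, LIBRARY))  # Song A
```

Counting matches per offset is what lets the clip start anywhere in the song: a stray shared note lands on a random offset, while a true match piles up on one.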
Social Software
Computers crunch data, but they need data to get started.
- Connecting Individuals: Facebook, personals, massively multiplayer online games, blogs, forums.
- Transcending Individuals
- Wikipedia aggregates knowledge. Carefully designed to prevent vandalism, control quality (e.g. indicating when an article needs help).
- Stack Overflow aggregates answers. Carefully designed to not be useless like Yahoo Answers: focus on experts, control ratings, voting, points, minimize clutter.
- Google Search uses PageRank to decide how good a webpage is.
- Advertisers try to artificially increase their PageRank by spamming your blog.
- Luis von Ahn
- explains the purpose and uses of CAPTCHAs.
- Games With a Purpose let you solve problems while having fun.
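To make the PageRank item above concrete, here is a minimal sketch of the textbook version of the algorithm: a page is good if good pages link to it, computed by passing rank along links over and over. The tiny link graphs are invented, the damping value of 0.85 is the commonly cited default, and real search ranking surely involves much more than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages in a {page: [pages it links to]} graph by repeated rank passing."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages  # a page with no links shares rank with everyone
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Invented three-page web: nobody links to the ad page.
WEB = {"home": ["blog"], "blog": ["home"], "ad": ["home"]}

# Same web after a spam comment on the blog adds a link to the ad page.
SPAMMED = {"home": ["blog"], "blog": ["home", "ad"], "ad": ["home"]}

print(round(pagerank(WEB)["ad"], 3))
print(round(pagerank(SPAMMED)["ad"], 3))
```

The second number is larger than the first, which is exactly why advertisers bother spamming blog comments: every link from a well-ranked page hands over some of that page's rank.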
Brain Scans
Brains, Meaning and Corpus Statistics is an excellent talk about applying machine learning to the brain and the web.
If you take anything away from this talk, I want you to take two ideas away.
The first is something we didn’t know before, that neural codings of ideas, at least for concrete nouns, are remarkably similar in our different heads. They’re not identical. We are different. But the amazing thing is that they’re similar enough that this modeling procedure can extrapolate from one person to another.
The second thing is that you can predict the neural activity in your brain when you think about a word based on how that word is used on the web. That’s exactly what this model is doing. The input to this model is “how is this word used on the web” and the output of the model is the predicted neural activity. And it works. So, there’s something fundamental about semantics that’s captured in corpus statistics.
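As a rough picture of what such a model looks like, here is a sketch that assumes the simplest possible form: each brain location's predicted activity is a weighted sum of a word's corpus features, and a prediction is judged by how closely it matches an observed scan. Everything concrete below is invented for illustration (the two words, the feature values, the weights, and the "observed" scan), and the linear form itself is an assumption, not a detail taken from the quote.

```python
import math


def predict_activity(features, weights):
    """One predicted value per brain location: a weighted sum of corpus features."""
    return [
        sum(w * f for w, f in zip(location_weights, features))
        for location_weights in weights
    ]


def cosine(a, b):
    """Similarity between two activity patterns (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up corpus features: how often each word appears near "eat" and "ride".
FEATURES = {"celery": [0.9, 0.1], "airplane": [0.1, 0.9]}

# Made-up weights for three brain locations; a real model would learn these
# from scans of other words.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Made-up scan of someone thinking about one of the two words.
observed = [0.8, 0.2, 0.4]

guess = max(
    FEATURES,
    key=lambda word: cosine(predict_activity(FEATURES[word], WEIGHTS), observed),
)
print(guess)  # celery
```

The point of the sketch is the direction of the arrow: the only input is how a word is used in text, and the output is a guess about brain activity that can be checked against a scan.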
Robots
- Jerry Pratt’s robotics videos
- Monopod hops around.
- Learning Locomotion video – notice the red lights.
- U Penn’s GRASP Lab featured in an Article about their maneuverable quadrotor with video.
- Autonomous Vehicle Driving
- Sebastian Thrun explains how his Stanford team won the DARPA Grand Challenge: have a robot vehicle race across the desert by itself.
- DARPA Urban Challenge “required teams to build an autonomous vehicle capable of driving in traffic, performing complex maneuvers such as merging, passing, parking and negotiating intersections.” (See highlight video.)
- Now Stanford has its eyes set on the Pikes Peak race.
Face Recognition (Skipped)
- Apple and Google are doing it.
- Google Goggles lets you search by image.
- Apple just bought a face recognition company called Polar Rose.
Performance Capture
- Emily describes Image Metrics.
- See Image Metrics Show Reel.
- James Cameron reveals that Avatar was acted and digitally rotoscoped.
Games
Nintendo Wii
- With the Wii, Nintendo took a risk by putting its full force into the Wii Remote motion sensitive controller. The remote uses:
- an accelerometer to sense when the remote is moving up-down, left-right, and forward-backward.
- an infrared light sensor with an accompanying Sensor Bar so that you can aim at things.
- Wii is fun. People liked it nearly twice as much (70.9 million in sales) as the PlayStation 3 (38.1 million) and the Xbox 360 (41.7 million). (See sales figures.)
- To increase the Wii Remote sensitivity, the Wii Motion Plus adds a gyroscope to sense rotation.
- See introduction.
- See Wii Sports Resort gameplay.
- Other links:
- Motion Plus technical demo.
- Johnny Chung Lee head tracking VR. Lee went to work at Microsoft.
Sony PlayStation
- PlayStation Move
- works differently than the Wii Remote but provides similar capabilities as seen in this review.
- uses:
- the PlayStation Eye Camera and a glowing ball (can glow any color to stand out from the background) to detect motion.
- an accelerometer to sense linear motion.
- a rate sensor to sense rotation.
- a magnetometer (compass) to correct against drift from the other sensors by knowing which way is north.
- PlayStation Eye Camera is a nice webcam “capturing standard video with frame rates of 60 hertz at a 640×480 pixel resolution, and 120 hertz at 320×240 pixels”. (See here.)
- EyePet is an augmented reality pet with all sorts of AI. You can even teach it songs.
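The drift correction mentioned in the sensor list above can be sketched with a simple complementary filter: trust the rotation sensor for quick changes, and nudge the result toward the compass reading so small errors cannot pile up. This is one common way to combine such sensors, not a description of how the Move controller actually does it. The function name, the `trust_gyro` value, and the simulated sensor bias are all assumptions, and angle wraparound at 360 degrees is ignored to keep the example short.

```python
def fuse_heading(heading, gyro_rate, compass_heading, dt, trust_gyro=0.98):
    """Blend the integrated rotation rate with the compass to cancel drift."""
    predicted = heading + gyro_rate * dt  # what the rotation sensor alone says
    return trust_gyro * predicted + (1.0 - trust_gyro) * compass_heading


# Simulation: the controller sits still, pointing at 90 degrees, for 60 seconds.
TRUE_HEADING = 90.0
GYRO_BIAS = 1.0   # made-up sensor error: reports 1 degree/second while still
DT = 0.1          # seconds between sensor readings

gyro_only = TRUE_HEADING
fused = TRUE_HEADING
for _ in range(600):
    gyro_only += GYRO_BIAS * DT
    fused = fuse_heading(fused, GYRO_BIAS, TRUE_HEADING, DT)

print(round(gyro_only, 1))  # drifts to about 150 degrees
print(round(fused, 1))      # stays within a few degrees of 90
```

With the rotation sensor alone, a tiny constant error grows without limit. Knowing which way is north gives the filter a fixed reference to pull back toward.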
Microsoft Kinect (formerly Natal)
Milo
- Peter Molyneux describes how Lionhead Studios is making a game about a virtual boy Milo.
- Given people’s response to the original demo video.
IBM’s Jeopardy playing computer Watson