Keeping computer and technology news simple.

July 30, 2008

New Video Surveillance Technology 'Recognizes' Abnormal Activity

BRS software can establish 'normal' on-camera activity – and alert security staff when something unusual occurs

The problem with video surveillance cameras is that, usually, there are too many of them for one security staffer to monitor. In a typical large enterprise setup, a single officer might be monitoring dozens -- even hundreds -- of cameras simultaneously, making it impossible to immediately recognize suspicious activity.

"To be honest, it's sheer luck if a security officer spots something in an environment like that," says John Frazzini, a former U.S. Secret Service agent and IT security consultant. "If you get a security manager alone behind closed doors, a lot of them laugh about what a waste of money it is."

Frazzini recently signed on to serve as president of a new company -- Behavioral Recognition Systems, or BRS Labs for short -- that aims to stop that waste. BRS Labs, which is launching both its business and its technology today, has received 16 patents on a new video surveillance application that can convert video images into machine-readable language, and then analyze them for anomalies that suggest suspicious behavior in the camera's field of view.

Unlike current video surveillance gear -- which requires a human to monitor it or complex programming that can't adapt to new images -- BRS Labs's software can "learn" the behavior of objects and images in a camera's field of view, Frazzini says. It can establish "norms" of activity for each camera, then alert security officers when the camera registers something abnormal in its field of view.

"It works a lot like the behavioral software that many IT people use on their networks," Frazzini says. "It establishes a baseline of activity, and then sends alerts when there are anomalies. The big difference is that, until now, there was no way to do this kind of analysis on video images, because the data collected by the cameras wasn't machine readable. We had to invent a way to do that."

The BRS Labs software can establish a baseline in anywhere from 30 minutes to several hours, depending on how much activity the camera recognizes and how regular the patterns of behavior are. "If you're monitoring a busy highway, where traffic comes and goes frequently on a regular basis, [the software] learns very quickly," Frazzini says. "If you're monitoring an outdoor fence line when the camera sees only three or four actions all day, it will take longer."

Once the software is operational, it can "recognize" up to 300 objects and establish a baseline of activity. If the camera is in a wooded area where few humans ever go, it will alert officers when it registers a human on the screen. If it is monitoring a high fence line, it will send an alert when someone jumps the fence.
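The learn-a-baseline, alert-on-outliers loop Frazzini describes can be sketched in a few lines of Python. The event labels, threshold, and class below are purely illustrative, not BRS Labs's actual system, which works on raw video converted into machine-readable form:

```python
from collections import Counter

class BaselineMonitor:
    """Toy sketch of a per-camera activity baseline.

    Events here are abstract labels (e.g. "car", "person"); a real
    system would derive them from the video feed itself.
    """

    def __init__(self, alert_threshold=0.01):
        self.counts = Counter()
        self.total = 0
        self.alert_threshold = alert_threshold

    def learn(self, event):
        # Baseline phase: record how often each event type occurs.
        self.counts[event] += 1
        self.total += 1

    def is_anomaly(self, event):
        # Monitoring phase: flag events that were rare (or absent)
        # during the baseline period.
        freq = self.counts[event] / self.total if self.total else 0.0
        return freq < self.alert_threshold

monitor = BaselineMonitor()
for _ in range(500):
    monitor.learn("car")       # busy highway: cars are routine
monitor.learn("person")        # a single pedestrian during baseline

print(monitor.is_anomaly("car"))     # False: cars are normal here
print(monitor.is_anomaly("person"))  # True: people are rare here
```

This also mirrors Frazzini's point about learning time: the more regular and frequent the activity, the faster the counts converge to a usable baseline.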

"The great thing about it is that you don't need a human to monitor the camera at all," Frazzini says. "The system can recognize the behavior on its own."

Because there are so many possible images that might cross in front of the camera, the BRS Labs technology will likely create a fair number of false positives, Frazzini concedes. "We think a three-to-one ratio of alerts to actual events is what the market will accept," he says. "We could be wrong."

Overall, however, the new technology should save enterprises money, because security officers can spend more time diagnosing alerts and less time watching their screens for anomalies. And the system is more accurate than human monitoring, he says.

"What we've seen so far is enterprises spending billions on video surveillance equipment, but having a lot of trouble proving a [return on investment]," Frazzini says. "What we're doing is helping them to get more out of that equipment."

The BRS Labs technology will be generally available in September. Pricing hasn't been finalized -- early implementations have ranged anywhere from $1,500 to $4,500 per camera.

Credits.

Cold War-era hack

IBM Selectric typewriter. Because the Selectric coupled a motor to a mechanical assembly, pressing different keys caused the motor to draw different amounts of current specific to each key. By closely measuring the current used by the typewriter, it was possible to determine what was being typed on the machine. To prevent such measurements, State Department Selectric typewriters were equipped with parts that masked the messages being typed.



July 24, 2008

An Algorithm to Turn Melodies Into Songs

New software creates chord progressions to accompany singers of all abilities

PHOTO: Emily Shur/Getty Images

11 July 2008—We all occasionally catch ourselves humming a tune, or singing along to the radio. What separates most of us from real musicians is the knowledge and skill to turn a hummed melody into a complete song. Three researchers in Washington state, however, aim to bridge at least some of that gap.

They've created a program called MySong that can generate a chord progression to fit any vocal melody. You simply sing into a computer microphone to the beat of a digitized metronome, and MySong comes up with an accompaniment of chords that sounds good with it. “Lots of songs have only three chords,” says Sumit Basu of Microsoft Research, a cocreator of MySong. “If you have the melody, it seems like you ought to be able to predict what the chords are.” Basu and his collaborators—Dan Morris, a Microsoft Research colleague, and Ian Simon, a Ph.D. student at the University of Washington, Seattle—will show off some of their program's features at The Twenty-Third AAAI Conference on Artificial Intelligence next week.

For any given melody, there's no such thing as a “correct” chord progression. But we tend to like songs with patterns we're used to hearing. When a musician begins fitting chords to a melody, the choices are guided by a lifetime of listening to other songs and figuring out why they sound good. MySong's creators gave their program a similar musical education by assembling a library of nearly 300 songs in the form of lead sheets, sheet music where written-out chords—such as C major, A minor, G major—accompany a single melody line. (For examples of lead sheets, check out Wikifonia, whose songs form the basis of MySong's library).

By analyzing this library, MySong creates probability tables that capture two factors. For a given chord, which chords are most likely to precede or follow it? And for each chord, which melody notes are likely to appear with it? These probability tables essentially give MySong a mathematically derived music theory, drawn from pop music itself.
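Building those two tables from lead sheets might look roughly like this; the tiny corpus and chord symbols below are invented stand-ins for the real Wikifonia-derived library:

```python
from collections import defaultdict

# Toy "lead sheet" corpus: each song is a list of (chord, melody_notes)
# pairs, one per measure. Two songs stand in for MySong's ~300.
songs = [
    [("C", ["C", "E", "G"]), ("F", ["F", "A"]), ("G", ["G", "B"]), ("C", ["C", "E"])],
    [("C", ["E", "G"]), ("Am", ["A", "C"]), ("F", ["F", "C"]), ("G", ["G", "D"])],
]

transitions = defaultdict(lambda: defaultdict(int))  # chord -> following chord
emissions = defaultdict(lambda: defaultdict(int))    # chord -> melody note

for song in songs:
    # Table 1: which chords follow which.
    for (chord, _), (next_chord, _) in zip(song, song[1:]):
        transitions[chord][next_chord] += 1
    # Table 2: which melody notes appear over each chord.
    for chord, notes in song:
        for note in notes:
            emissions[chord][note] += 1

def transition_prob(chord, next_chord):
    total = sum(transitions[chord].values())
    return transitions[chord][next_chord] / total if total else 0.0

print(transition_prob("C", "F"))  # how often F follows C in this corpus
```

With a realistic library, these counts become the learned "music theory" the program consults when scoring candidate chords for a new melody.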

After a user records a vocal track, MySong initially provides two ways to alter the generated chord progressions, both in the form of slider bars. One slider affects how much weight each of the two probabilities is given: whether a chord best matches the vocal melody notes or whether it best fits in with the chords that surround it. The second slider allows the user to weight between major (happy-sounding) and minor (sad-sounding) keys.
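The first slider could be modeled as a simple weighted blend of the two probability sources. The article doesn't specify MySong's actual blending function, so the linear form and names here are assumptions for illustration:

```python
def chord_score(melody_fit, harmonic_fit, slider):
    """Blend the two probability sources with a user-facing slider.

    slider = 0.0 -> trust only how well the chord fits the melody notes;
    slider = 1.0 -> trust only how well it fits the surrounding chords.
    A linear blend is an illustrative guess, not MySong's actual formula.
    """
    return (1.0 - slider) * melody_fit + slider * harmonic_fit

# A chord that matches the melody poorly but follows its neighbors
# naturally scores higher as the slider moves toward harmonic context.
print(chord_score(melody_fit=0.2, harmonic_fit=0.9, slider=0.0))  # 0.2
print(chord_score(melody_fit=0.2, harmonic_fit=0.9, slider=1.0))  # 0.9
```

Moving the slider re-scores every candidate chord, which is why a single drag can change the feel of the whole accompaniment.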

“Typically in other machine-learning approaches, the blending would be fixed by whoever develops it,” says Morris. “Rather than fixing it in our code, we put that on a slider and exposed it to the user as an extra creative degree of freedom.” This concept of opening to amateur users the hidden mathematics of the underlying model is one of the topics the researchers will discuss at the AAAI conference. “With just a few clicks you can actually do a lot of manipulation and get a lot of different feels to accompany what you're doing,” adds Basu.

The technology behind MySong goes beyond simple modeling of song structure: before it can do anything else, it first has to make sense of the imprecise acoustic sounds we call singing. “The voice is a real challenge,” says Basu, who notes that most people, including trained singers, use some vibrato, where the voice vibrates above and below the intended note. Although computers can accurately track pitch, it's difficult for them to determine the exact notes the singer meant to sing.

MySong's team realized that they could work around the complexities of the voice by requiring the user to pick a tempo and sing along with a digitized metronome so that the melody maps to uniform units of time. MySong takes frequency samples 100 times a second, and, rather than trying to accurately assemble a melody from those frequencies, it keeps track of how often each frequency appears during a user-defined number of beats. This quantity-over-quality approach means that the chord selected for a measure best fits those notes hit most often or held the longest.
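The histogram idea can be sketched like so. The frequency-to-note mapping below is a deliberate simplification of real pitch tracking, and the simulated samples are invented:

```python
import math
from collections import Counter

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_histogram(frequency_samples):
    """Collapse one measure's raw pitch samples into note counts.

    Rather than reconstruct an exact melody, count how often each pitch
    class appears; the chord picked for the measure should then fit the
    notes hit most often or held the longest.
    """
    counts = Counter()
    for hz in frequency_samples:
        # Map frequency to the nearest MIDI note, then to a pitch class.
        midi = round(69 + 12 * math.log2(hz / 440.0))
        counts[NOTE_NAMES[midi % 12]] += 1
    return counts

# One simulated measure, sampled 100 times a second as MySong does:
# the singer holds A (440 Hz) and briefly wobbles up to B (~493.9 Hz).
samples = [440.0] * 80 + [493.88] * 20
print(note_histogram(samples))  # Counter({'A': 80, 'B': 20})
```

Rounding to the nearest note also quietly absorbs vibrato: small oscillations around the intended pitch all land in the same histogram bin.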

MySong's novelty, says Christopher Raphael, an associate professor of informatics at Indiana University, is in this ability to avoid technical problems rather than solve them. He also sees promise in the program's ability to engage novices, a sentiment echoed by Gil Weinberg, director of music technology at Georgia Tech. “I believe everyone has the ability to express themselves by singing songs or banging on something,” says Weinberg. “What's nice about the voice is that you don't even need an object to bang on.” Weinberg says he hopes that MySong will provide a gateway for further learning. “Many students just don't get to the expressive and creative portion of music, because there's so much technique and theory in the beginning,” he says.

Basu, Morris, and Simon may write software by day, but they each spend much of their free time writing and performing music. “We're kind of casual, amateur musicians who love making music,” says Morris. In one test of MySong's abilities, they pitted some of the chord progressions they wrote for their own songs against those chosen by the software. A blind jury of experienced musicians consistently rated MySong's chord progressions nearly as highly as those the humans had chosen.

For Basu, such success was a bittersweet triumph. “It made me feel proud for MySong, but these were the chords I had slaved over,” he laughs. “But when I listened to what MySong had chosen, it was often more interesting than what I had done.”

Since developing MySong, Basu has used the program to assist him in the early stages of songwriting. “Dan [Morris] always jokes that I'm the one user of MySong,” says Basu. “There are progressions I use now in my music that I learned from MySong.”

As a user and creator, Basu emphasizes that MySong is meant for creativity assistance rather than creativity replacement. “The creative spark still has to come from people,” he says, “and one of the things that makes me feel better as a musician is that there's more to music than just the chords you choose.”

Microsoft has not yet decided to commercialize MySong, but the team hopes to improve its core modeling algorithms, as well as provide more user control over which libraries the program relies on. Basu imagines users being able to select chords based on libraries of specific musicians: a slider, for instance, that could blend chord-progression styles between the jazz singer Ella Fitzgerald and the metal band Slayer.

Listen to audio samples created by MySong users and judge the chords for yourself at http://research.microsoft.com/~dan/mysong/


Microsoft Engineers Invent Energy-Efficient LCD Competitor

Telescopic pixel display lets more light out than an LCD

PHOTO: Anna Pyayt

21 July 2008—Researchers from Microsoft say they've built a prototype of a display screen using a technology that essentially mimics the optics in a telescope but at the scale of individual display pixels. The result is a display that is faster and more energy efficient than a liquid crystal display, or LCD, according to research reported yesterday in Nature Photonics. 


Anna Pyayt led the research as part of her Ph.D. thesis at the University of Washington in collaboration with two Microsoft engineers. Microsoft funded the work and has also applied for a patent on the technology.

The most common display technology, the LCD, is inefficient. The display is lit from the back, but less than 10 percent of that light reaches the surface of the screen. The pixels work as on-off shutters, and the light has to travel through several layers before reaching the screen. In an LCD, one of those layers is a polarizing filter, which absorbs about 50 percent of the light passing through it.

By contrast, the telescopic pixel design uses reflective optics. Each pixel functions as a miniature telescope. There are two mirrors: a primary mirror facing the backlight (away from the screen) with a hole in the middle, and a smaller secondary mirror that faces it from 175 micrometers behind. The secondary mirror is the same size and shape as the hole. Without an electric field, the mirrors stay flat, and light coming from behind the pixel is reflected back, away from the screen. But applying voltage bends the primary mirror into the shape of a parabola. The bending focuses light onto the secondary mirror, which reflects it out through the hole in the primary mirror and onto the screen.

The design greatly increases the amount of backlight that reaches the screen. The researchers were able to get about 36 percent of the backlight out of a pixel, more than three times as much light as an LCD can deliver. But Microsoft senior research engineer Michael Sinclair says that through design improvements, he expects that number to go up—theoretically, as high as 75 percent.

The telescopic display can also switch its pixels on and off faster than an LCD can, going from dark to light and back again in just 1.5 milliseconds, about six times as fast as a typical LCD pixel.

Researchers not associated with the study also see promise in the technology, particularly because it does not compromise picture quality for power efficiency. “This novel approach for transmissive displays is highly attractive because it can provide high-efficiency analog gray scale,” says Jason Heikenfeld, assistant professor of electrical engineering at the University of Cincinnati and director of the Novel Devices Laboratory. Most microelectromechanical systems display technologies—such as Texas Instruments' digital micromirror devices—are digital; they are either on or off. In the telescopic pixel design, the amount of light emitted is a function of voltage. The mirrors act like springs—when you apply more voltage, they bend further, reflecting more light to the screen.
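The article gives the endpoints of that voltage-to-brightness relationship (nothing emitted with no field, roughly 36 percent of the backlight at the 120 V maximum) but not the actual deflection curve, so this toy model simply assumes a smooth, spring-like quadratic response to illustrate analog gray scale:

```python
def emitted_light(voltage, max_voltage=120.0, max_transmission=0.36):
    """Toy model of analog gray scale in a telescopic pixel.

    Higher voltage bends the mirror further, reflecting more light to
    the screen. The quadratic curve is an assumption for illustration;
    only the two endpoints come from the reported figures.
    """
    v = max(0.0, min(voltage, max_voltage))      # clamp to valid range
    return max_transmission * (v / max_voltage) ** 2

# Intermediate voltages give intermediate brightness: analog gray
# scale, unlike digital micromirrors that are simply on or off.
for v in (0, 60, 120):
    print(f"{v:>3} V -> {emitted_light(v):.2f} of backlight emitted")
```

The continuous curve is the point of contrast with digital micromirror devices, which can only approximate gray levels by flickering rapidly between fully on and fully off.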

Other reflective displays that have tried to improve efficiency have not been popular with consumers because they are not very bright, says Heikenfeld. So a technology like the telescopic pixel design may serve to satisfy demands for power efficiency and image quality, he adds. However, the telescopic pixels require high voltages to operate—up to 120 volts—and Heikenfeld believes that Microsoft will have to reduce that voltage for a commercial product.

One potential concern about the technology may be its durability, because of the constant bending and movement of the mirrors. A durability test has not yet been done, says Microsoft's Sinclair. However, he adds that the group did produce an array of pixels that performed without any glitches, a sign that the technology can be manufactured. “It shows definite signs of a future,” he says.

Pyayt agrees. “It's not a final, perfectly working system, but it's in progress, and I believe it's possible to optimize it to be fully functional,” she says.

The technology is still in its nascent stages, and the project is unusual for Microsoft, which is not in the display business. Sinclair says there is a possibility that Microsoft will collaborate with a display manufacturer, but commercial production is at least five years away.


July 11, 2008

Direct video manipulation interface

Direct manipulation of video is one of the more uncanny HCI concepts I've ever seen. Instead of manipulating time with a traditional scrubber bar, the user can drag objects in the video across their path of movement. Nothing in the video actually changes, but the perception is that you can directly manipulate the objects in the video stream by pulling them around through time.

There's a Windows application called DimP which implements this interface. When you hover over a movable object in the video, a light path appears that traces the object's motion curve, along which you can then drag the object. From the DimP website:

So what's being manipulated, exactly? Both the video content (e.g., the things you see moving in the video) and the "tape head". When using DimP, the user directly manipulates the video content and indirectly manipulates the tape head. When using the seeker bar, the user directly manipulates the tape head and indirectly manipulates the video content.
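At its core, the mapping from a drag gesture back to the tape head can be sketched as a nearest-point search along the object's precomputed motion path. The trajectory data and function below are hypothetical; DimP's actual video analysis and interaction code are far more involved:

```python
def frame_for_drag(trajectory, drag_point):
    """Map a drag position to a video frame, DimP-style.

    trajectory: list of (x, y) object positions, one per frame, as
    extracted in advance by video analysis. Dragging the object really
    scrubs the tape head to the frame whose tracked position lies
    closest to the cursor.
    """
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(range(len(trajectory)),
               key=lambda i: dist2(trajectory[i], drag_point))

# A car tracked moving left to right across five frames:
path = [(0, 50), (25, 52), (50, 55), (75, 52), (100, 50)]
print(frame_for_drag(path, (60, 54)))  # cursor lands nearest frame 2
```

Nothing in the frames changes; dragging the object just picks which frame to show, which is exactly the direct-content, indirect-tape-head relationship the DimP site describes.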

The video above describes how DimP works in a bit more detail, showing a few different video scenarios where direct manipulation really shines. It's intuitive and bizarre at the same time. If the universe is completely deterministic, I can't help but think this is what time travel must look like.

DimP - A Direct Manipulation Video Player
DRAGON - Direct Manipulation Interface Demo for OS X
