Monday, February 18, 2008

Toward the Hyperreality Engine Pt 3: The Software

The software for the hyperreality engine will be yet another form of specialized AI, one that in many ways may actually be more complex and challenging than "greater-than-human" AI.

Image processing
- The type of content growing faster than any other is everyone's collective store of digital content, especially images and movie clips. Gone are the economics of the past, where each pic cost measurable money. Now the only real cost is its space on a hard disk, which is already just a fraction of a penny and rides the tech wave, getting cheaper.
- So we can take immense numbers of pictures, hundreds an hour if desired, plus camcorder footage, especially high-definition (I got one a while back, a 10x improvement over those that came before). But there is still a bottleneck, and it's on the consumption end: despite all the tech, you can really only view one image at a time, one clip at a time.
- That content should be put to work to create virtual environments. This will permit greater consumption volume, as several pics could be combined into one, with a short dynamic sequence connecting them. Very cool.
- Of course, this would also allow deeper immersion in the content captured. Using footage from today's high-definition camcorders should allow photo-realistic recreations.
- And yes, I know this is tough, very demanding in terms of both software and hardware. It must be photo-realistic, and it must be as close to real time as possible. In this case, "real time" might mean up to an hour to create the 3D virtual environment; if the result were interactive enough, that amount of time might be acceptable.

This application will need a lot of AI, of a focused nature: taking in the new images, matching them against previously taken pics, and automatically incorporating the new content into the existing virtual reality environment. Doing this proactively removes some of the time it would take manually, so the user can get to experiencing the environment faster and spend more time there. Ideally, the user would modify the environment if needed from within the environment itself, by saying something like, "No, there were more clouds that day." It would be ideal for these corrections to be communicated in the user’s natural language, and this is where this software comes closest to HAL: simply understanding these comments, and being able to respond in a similarly natural way. As long as each user comment is remembered and used again where appropriate, this should be fine for the end user.

In other words, it is alright for an AI system to ask “dumb” questions, as long as it only needs to ask each one once. If it has to ask again, the user will get more annoyed using natural language than with the current ways of doing things; it must remember. And if it applies this learning incorrectly in some other case, that’s alright too, as long as, once corrected, it can discriminate between the two cases in future.

Users will be quite patient about this – as long as the software learns rapidly, preferably the first time.
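
To make the "ask once, remember, then discriminate" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class name `CorrectionMemory`, the idea of describing a scene with a set of tags, and the rule of picking the remembered case that shares the most tags with the current scene. A real system would need something far richer than tag overlap.

```python
class CorrectionMemory:
    """Hypothetical store of user corrections, keyed by question and scene tags."""

    def __init__(self):
        # question -> list of (scene tags, answer), oldest first
        self._rules = {}

    def recall(self, question, context):
        """Return a remembered answer, or None if the user must be asked."""
        rules = self._rules.get(question)
        if not rules:
            return None  # never asked before: a "dumb" question is allowed once
        context = frozenset(context)
        # Pick the remembered case sharing the most tags with this scene.
        # Iterating newest-first means the latest correction wins ties.
        tags, answer = max(reversed(rules), key=lambda rule: len(rule[0] & context))
        return answer

    def learn(self, question, context, answer):
        """Record what the user said for this scene, replacing any older answer."""
        context = frozenset(context)
        rules = self._rules.setdefault(question, [])
        rules[:] = [rule for rule in rules if rule[0] != context]
        rules.append((context, answer))


memory = CorrectionMemory()
beach = {"beach", "july"}
park = {"park", "october"}

first_try = memory.recall("cloud_cover", beach)   # None, so the system asks
memory.learn("cloud_cover", beach, "heavy")       # "there were more clouds that day"
guess = memory.recall("cloud_cover", park)        # reuses "heavy", possibly wrongly
memory.learn("cloud_cover", park, "clear")        # the user corrects it
print(memory.recall("cloud_cover", beach), memory.recall("cloud_cover", park))
```

After the second correction the two scenes get different answers, which is the discrimination described above: the mistake happens once, and both cases are kept apart from then on.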

This does not have to happen all at once. In fact, there is a smooth progression in capability, each step of which is much more immersive than current tech.
To start with, there would be some limited dynamic motion and sound. If there is a picture of someone at the beach, the water should be making waves, and the sound of the ocean could be played as well.

The path is one of steadily increasing photo-realism and dynamism in the picture stock. Being able to interpolate a person or thing between two photos into a short clip would be very popular. If you have a pic of someone standing up and another sitting down, the clip would move between the two with very natural flow. The person would move from a standing position to a sitting position as realistically as if it were a camcorder clip.
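
The standing-to-sitting example can be sketched in its very simplest form as interpolation between two poses. This is only a toy under stated assumptions: it presumes each photo has already been reduced to a handful of joint angles (the `hip` and `knee` values below are made up), and it blends them with a standard smoothstep easing curve. Getting from photos to such a pose, and from the pose back to photo-realistic frames, is the hard part and is not shown.

```python
def ease_in_out(t):
    """Smoothstep easing: motion starts and ends gently instead of abruptly."""
    return t * t * (3.0 - 2.0 * t)


def interpolate_pose(start, end, frames):
    """Return `frames` poses moving from `start` to `end`, endpoints included.

    Each pose is a dict of hypothetical joint names to angles in degrees.
    """
    if frames < 2:
        raise ValueError("need at least two frames (the two photos)")
    if start.keys() != end.keys():
        raise ValueError("both poses must describe the same joints")
    clip = []
    for i in range(frames):
        w = ease_in_out(i / (frames - 1))  # 0.0 at the first photo, 1.0 at the last
        clip.append({joint: (1.0 - w) * start[joint] + w * end[joint]
                     for joint in start})
    return clip


# Made-up joint angles standing in for what would be extracted from two photos.
standing = {"hip": 180.0, "knee": 180.0}
sitting = {"hip": 90.0, "knee": 90.0}

for pose in interpolate_pose(standing, sitting, 5):
    print(pose)
```

The easing curve is a design choice: plain linear blending would also connect the two poses, but the constant speed tends to look mechanical, whereas easing in and out is closer to how a person actually sits down.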

A real breakthrough will be when these avatars can use existing clips and then extend the model into new situations. Conversations that use words not in any clip, for instance.
