Thursday, August 26, 2010

Projector calibration

To project a 3D scene onto a real-life environment, the position of the virtual camera must match that of the projector.

Many existing techniques do this with cameras and computer vision methods.
However, we developed a method that doesn't require the use of a camera.

The goal of this method is to obtain the position and orientation of the projector in a coordinate system attached to the scene.
This problem is equivalent to finding the position and orientation of the chosen coordinate system relative to the projector.

This method uses three known points at corners of the calibration surface. The points form a right angle and are aligned with the axes of the target coordinate system, and the distances between them are known.

The projector casts rays of light. Three of these rays pass through the chosen points and also through the origin of the projector's own coordinate system (inside the projector).
To obtain the equation of each of these rays, only one point other than the origin is needed.
This point can be obtained from the screen coordinates of the pixel that is projected on each target point, for example, by clicking on them.

Using the screen coordinates of the point and some known projector parameters, such as the resolution and the projection angle, it is possible to find the 3D coordinates of the point and then the equation of the ray.
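As a rough illustration of this step, here is a minimal sketch in Python. It assumes an ideal pinhole model with the optical axis through the center of the image, square pixels and no lens shift, which real projectors often do not satisfy. The function name and the axis convention (x right, y up, z forward) are our own choices for the example.

```python
import math

def pixel_to_ray(u, v, width, height, horizontal_fov_deg):
    """Return a unit direction for the ray through screen pixel (u, v).

    Assumption: pinhole model, principal point at the image center,
    projector looking along +z, x to the right and y up.
    """
    # Distance from the projection center to the image plane, in pixels.
    focal = (width / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    x = u - width / 2.0
    y = height / 2.0 - v  # screen y grows downwards
    norm = math.sqrt(x * x + y * y + focal * focal)
    return (x / norm, y / norm, focal / norm)
```

Every point on the ray is then t times this direction, with t >= 0 and the origin inside the projector.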
Now that the equations of the three rays are known, we need to find, for each ray, the parameter t at which the ray meets its corner of the target area. This can be written as a system of equations using the ray formula and the dimensions of the target area.

Here a, b and c are the rays, and j, k and l are the width, height and diagonal of the surface.
The goal is to find the parameters t0, t1 and t2 that satisfy this non-linear system.
An approximate solution can be obtained using numerical methods. At the moment our implementation uses the trust-region-dogleg method from MATLAB.
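To make the idea concrete, here is a small Python sketch that solves such a system with plain Newton iterations. It is not the trust-region-dogleg solver mentioned above, only a simpler stand-in. It assumes that a, b and c are unit direction vectors, that the corner on ray b is the right-angle corner, and that the three equations pair up as |a(t0) - b(t1)| = j, |b(t1) - c(t2)| = k and |a(t0) - c(t2)| = l. The function names are made up for the example, and a reasonable initial guess (for instance the approximate distance to the wall) is required.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [[rhs[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution

def solve_corner_parameters(a, b, c, j, k, l, guess, iterations=20):
    """Newton's method for (t0, t1, t2); a, b, c must be unit vectors."""
    dot = lambda p, q: sum(x * y for x, y in zip(p, q))
    ab, bc, ac = dot(a, b), dot(b, c), dot(a, c)
    t0, t1, t2 = guess
    for _ in range(iterations):
        # Squared-distance residuals, expanded using |a| = |b| = |c| = 1.
        f = [t0 * t0 + t1 * t1 - 2 * t0 * t1 * ab - j * j,
             t1 * t1 + t2 * t2 - 2 * t1 * t2 * bc - k * k,
             t0 * t0 + t2 * t2 - 2 * t0 * t2 * ac - l * l]
        # Jacobian of the residuals with respect to (t0, t1, t2).
        jac = [[2 * (t0 - t1 * ab), 2 * (t1 - t0 * ab), 0.0],
               [0.0, 2 * (t1 - t2 * bc), 2 * (t2 - t1 * bc)],
               [2 * (t0 - t2 * ac), 0.0, 2 * (t2 - t0 * ac)]]
        step = solve3(jac, [-value for value in f])
        t0, t1, t2 = t0 + step[0], t1 + step[1], t2 + step[2]
    return t0, t1, t2
```

The system can have more than one solution, so the result depends on the initial guess; a more robust solver would also guard against a singular Jacobian.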

Once the values for t0, t1 and t2 are known, the coordinates for the corners of the target area are a(t0), b(t1) and c(t2). The base for the target coordinate system is

The location of the projector relative to this base of coordinates is obtained by projecting a(t0) on this base.

Saturday, July 31, 2010


Structure from Stereo - A Review

Stereopsis is a passive technique: the triangulation has to be achieved with only the existing ambient illumination.
Hence a correspondence needs to be established between features from two images that correspond to some physical feature in space.
Then, provided the position of centers of projection, the effective focal length, the orientation of the optical axis, and the sampling interval of each camera are known, the depth can be reconstructed using triangulation.

The principal steps are preprocessing, establishing correspondence and recovering depth.

Preprocessing: in this step, image locations satisfying certain well-defined feature characteristics are identified in each image.

Establishing correspondence (matching): given two or more views of a scene, correspondence needs to be established among homologous features (features that are projections of the same physical identity in each view).

Recovering depth: the depth of each matched feature is obtained by triangulation.
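For the simplest stereo geometry the depth recovery step reduces to one formula. The sketch below assumes two identical, rectified cameras with parallel optical axes, a focal length expressed in pixels and a known baseline; the function name and parameters are ours, not taken from the review.

```python
def depth_from_disparity(focal_px, baseline_m, x_left_px, x_right_px):
    """Depth of a matched feature for rectified, parallel cameras.

    Similar triangles give Z = f * B / d, where d is the disparity
    between the feature's column in the left and right images.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("a visible point must have positive disparity")
    return focal_px * baseline_m / disparity
```

Distant points have small disparities, so a one-pixel matching error costs much more depth accuracy far away than close up.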

The review presents different techniques for recovering the 3-D structure of a scene from the analysis of stereo images.

One difficulty in stereo is solving the correspondence problem in the presence of occlusion: most scenes contain regions that appear in only one of the two images. These regions are called half-occluded, or unmatched.

Applications of stereo: cartography, aircraft navigation, autonomous land rovers, robotics, industrial automation and stereomicroscopy.

Friday, July 9, 2010


Realtime Video Studio for Professional VJs.

VDMX5 is a program that lets you assemble custom realtime video processing applications.

- Layers are the basic building block of VDMX. Movies, pictures and Quartz Composer documents can be displayed on layers.
The layers are composited with each other to obtain the resulting image.
- Modular architecture: a setup is built by loading plugins from an ever-expanding list of available add-ons. Any setup can be saved and instantly restored, making it possible to switch between sets on the fly.

As in Modul8, the mapping is done manually; see an example.

VDMX is proprietary software.

Wednesday, June 23, 2010


One of the most popular programs for VJs

Modul8 is a MacOS X application designed for real time video mixing and compositing. It has been designed for VJs and live performers.

It is a real-time video mixer with a user interface designed for real-time work.
It is based on a layer metaphor, and each change can be seen immediately in the composition.

Each media item is a layer that can be moved, resized and rotated.

It has 7 outputs plus one for the user interface, and you can determine which region of the composition is sent to which projector or screen.

The system can be extended with new modules written in Python.
An online library of modules is available, most of them free.
Modul8 is proprietary software.

Wednesday, June 16, 2010

Different structured light techniques

Structured light is the process of projecting a known pattern of pixels onto a scene. The displacement of the stripes allows for an exact retrieval of the 3D coordinates of any details on the object's surface (depth and surface information).

A structured-light 3D scanner is a device for measuring the three-dimensional shape of an object using projected light patterns and a camera system.

This paper presents a comprehensive survey on coded structured light techniques.
The patterns are specially designed so that codewords are assigned to a set of pixels. Every coded pixel has its own codeword, so there is a direct mapping from the codewords to the corresponding coordinates of the pixel in the pattern. The codewords are simply numbers, which are mapped in the pattern by using grey levels, color or geometrical representations.

It classifies pattern projection techniques into three groups according to their coding strategy:
  • time-multiplexing - generate the codewords by projecting a sequence of patterns along time, so the structure of every pattern can be very simple
  • neighborhood codification - represents the codewords in a unique pattern
  • direct codification - define a codeword for every pixel, which is equal to its grey level or color

Time-multiplexing strategy
A set of patterns is projected onto the measuring surface. The codeword for a given pixel is formed by the sequence of illuminance values for that pixel across the projected patterns. The codification is called temporal because the bits of the codewords are multiplexed in time.
This kind of pattern can achieve high accuracy in the measurements. This is due to two factors:
1- since multiple patterns are projected, the codeword basis tends to be small (usually binary) and therefore a small set of primitives is used, being easily distinguishable among each other;
2- a coarse-to-fine paradigm is followed, since the position of a pixel is being encoded more precisely while the patterns are successively projected.

They classify the different techniques based on time-multiplexing as:
a) techniques based on binary codes: a sequence of binary patterns is used in order to generate binary codewords
b) techniques based on n-ary codes: a basis of n primitives is used to generate the codewords
c) Gray code combined with phase shifting: the same pattern is projected several times, shifting it in a certain direction in order to increase resolution
d) hybrid techniques: combination of time-multiplexing and neighborhood strategies
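To see how a binary time-multiplexed code works in practice, here is a minimal Python sketch using the Gray code, one common choice of binary code. Each projected pattern contributes one bit of a column's codeword, and decoding just collects the bits a camera pixel observed over time. The function names are ours, and a real decoder also has to threshold the camera images to decide whether each bit is lit.

```python
def gray_encode(n):
    """Binary-reflected Gray code of the integer n."""
    return n ^ (n >> 1)

def gray_decode(g):
    """Inverse of gray_encode."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def stripe_patterns(width, bits):
    """patterns[p][x] is 1 when column x is lit in pattern p (MSB first)."""
    return [[(gray_encode(x) >> (bits - 1 - p)) & 1 for x in range(width)]
            for p in range(bits)]

def decode_column(observed_bits):
    """Recover the projector column from the bits seen over time."""
    g = 0
    for bit in observed_bits:
        g = (g << 1) | bit
    return gray_decode(g)
```

Neighboring columns differ in exactly one pattern, so a decoding error at a stripe boundary shifts the result by one column instead of producing an arbitrary value.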

Spatial neighborhood
These techniques tend to concentrate the whole coding scheme in a single pattern.
Classification of techniques:
a) strategies based on non-formal codification: the neighborhoods are generated intuitively
b) strategies based on De Bruijn sequences: the neighborhoods are defined using pseudorandom sequences
c) strategies based on M-arrays: extension of the pseudorandom theory to the 2-D case
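As an illustration of the De Bruijn idea, the sketch below generates a sequence over k symbols (for example, stripe colors) in which every window of n consecutive symbols appears exactly once, so seeing n neighboring stripes is enough to know where you are in the pattern. It uses the standard recursive construction; the helper locate is a name made up for this example, and the surveyed techniques add their own constraints on top of this.

```python
def de_bruijn(k, n):
    """De Bruijn sequence B(k, n) as a list of symbols 0..k-1."""
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for symbol in range(a[t - p] + 1, k):
                a[t] = symbol
                db(t + 1, t)

    db(1, 1)
    return sequence

def locate(sequence, window):
    """Index of the first occurrence of a window of symbols, or -1."""
    size = len(window)
    for i in range(len(sequence) - size + 1):
        if sequence[i:i + size] == list(window):
            return i
    return -1
```

With 3 colors and windows of 3 stripes this gives a pattern of 27 stripes in which any 3 adjacent stripes identify their position.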

Direct codification
The entire codeword for a given point is contained in a unique pixel. In order to achieve this, it is necessary to use either a large range of color values or introduce periodicity. In theory, a high resolution of 3D information can be obtained. The sensitivity to noise is very high because the "distance" between "codewords", i.e. the colors used, is nearly zero. Moreover, the imaged colors depend not only on the projected colors, but also on the intrinsic color of the measuring surface.

They discuss two groups of methods:
a) codification based on grey levels: a spectrum of grey levels is used to encode the points of the pattern
b) codification based on color: these techniques take advantage of a large spectrum of colors.

After the classification, they implement some of the methods, compare them and present their results.

Sunday, May 30, 2010

Structured light - Continuity vs Discontinuity

A third experiment with structured light was performed, trying to scan a simple scene that is widely used in the video-mapping field: a group of primitive shapes.
We faced a lot of problems with this apparently simple scene, which made us think that structured light wasn't suitable for discontinuous shapes. So we built a complex but continuous geometry by wrapping the scene with a piece of cloth. The SL applet had no trouble generating the 3D geometry.

After getting these results we asked the creator of the applet to confirm our early conclusion, and we got a response. It was true that the experiment performed wasn't suitable for discontinuous shapes, because it uses phase-shifting scanning, which is based on the principle of propagating depth values across a surface. So if two surfaces are disconnected, it cannot determine how they are related depth-wise.
However, it wasn't true that SL is inappropriate for discontinuous shapes. Other pattern codifications and algorithms that are less geared towards real time can return better results for this type of scene.
This is what we'll be working on over the next weeks.
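The propagation problem is easy to reproduce in one dimension. The Python sketch below is our own toy example, not code from the applet: it unwraps a wrapped phase signal by assuming that neighboring samples never differ by more than half a period. On a continuous ramp this recovers the true phase, but a jump of whole periods between two samples is invisible in the wrapped data, so both surfaces come out at the same depth.

```python
import math

TWO_PI = 2.0 * math.pi

def unwrap(wrapped):
    """Propagate phase along a row, assuming steps smaller than pi."""
    result = [wrapped[0]]
    for value in wrapped[1:]:
        delta = value - result[-1]
        # Remove whole periods so that the step lies within half a period.
        delta -= TWO_PI * round(delta / TWO_PI)
        result.append(result[-1] + delta)
    return result

# A continuous surface: the phase grows smoothly.
ramp = [0.3 * i for i in range(30)]
# The same surface, but the second half sits two full periods further away.
stepped = [p + (2 * TWO_PI if i >= 15 else 0.0) for i, p in enumerate(ramp)]

recovered_ramp = unwrap([p % TWO_PI for p in ramp])
recovered_stepped = unwrap([p % TWO_PI for p in stepped])
```

Here recovered_ramp matches ramp, while recovered_stepped also matches ramp rather than stepped: the discontinuity has been lost.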

Photos of the scenes - discontinuous and continuous

Thursday, May 20, 2010

Second scanning

In this second experiment with structured light we scanned a simple scene: two perpendicular walls.
A lot of video mapping applications like this one focus on objects with planar faces, such as boxes, so the idea of this experiment was to test structured light with this type of object.
The geometry was retrieved fairly well with minor tweaks. The pictures were only cropped around the target zone, and the Z scale and Z skew were adjusted in the SL applet.
Here are the pictures taken and a snapshot of the retrieved cloud of points.

Wednesday, May 19, 2010

Saturday, May 15, 2010

First tests

Last Saturday we finally got hold of a projector and started the first mapping and scanning experiments.

Manual mapping

The goal of this experiment was to project two different images on two perpendicular walls (a corner of the room). The projector was not aligned with either of these walls, so any planar projection resulted in deformed images. Using 3D software, two textured planes were created to simulate the target planes. Then the corner vertices were manually adjusted until the projected images looked undeformed on the walls.

Remark: The deformed planes in the 3D software didn't represent the shape of the walls but a deformed planar shape that matches the projection. That led us to think that we could accomplish the goal of projecting onto real 3D structures using simplified 2D representations of the 3D object.

This first mapping was performed manually. Programs like Modul8 map the images to the geometry manually.
In others, like vvvv, a definition of the geometry is needed to generate the virtual model. This virtual model is then aligned with the real one.
This last, automatic approach is what we want to achieve.

3D Scanning

The second experiment was scanning objects using the structured light technique.
The goal was to obtain three photos projecting the three-phase patterns seen in the link above. These photos and the mentioned patterns are then processed by the application to generate a cloud of points that represents the retrieved 3D geometry.

Many configurations were tested using different camera positions and different distances from the projector to the target object. We started out using a standard web camera, but the captured images had very poor resolution and we had problems focusing on the scanned object/person (it was too "far" in the image, with the corresponding loss of detail).

The best results were obtained using a 5-megapixel photo camera with 12x optical zoom, and with the camera's field of view similar to the projector's. The pictures were also cropped with an image editor so that the applet focuses only on the target object.

The calibration of the projector seemed to be very important, so that the projected pattern looks as sharp as possible. That, added to the fact that we had to use a high-resolution photo camera instead of a low-resolution web camera, led us to think that the structured light algorithm requires images with a mid-to-high resolution.
Another conclusion, taken directly from the obtained results, is that the objects to scan shouldn't have glossy surfaces because the patterns are lost on those areas.

Taking into consideration the conclusions expressed above, we tend to think we will have problems implementing a real time solution of 3D reconstruction using standard market cameras.

See here the pictures obtained of Javier's (jefa) reconstructed head.

Recognized 2.5D shape:

Tuesday, May 11, 2010

Dynamic Projection Environments for Immersive Visualization

This paper presents a system for dynamic projection.
The projection surfaces are large screens (human-body scale), each mounted on wheels so that it can be moved easily. When the projection surfaces are moved, the application recalculates the visualization on the fly, in real time.

They use a technique they call projection keyframing to provide continuity on moving surfaces while waiting for simulations to complete.

The system allows multiple users to participate interactively with each other and the visualization application.

The initial target application for the system was interactive architectural lighting visualization: it gives architects and clients a simulated environment in which to evaluate the natural and artificial lighting of a proposed architectural design.

The distributed system allows for:
- Projector keyframing - a technique to impart slow applications with a dynamic, responsive feel
- Tracking projection surfaces of known geometry with simple IR-based LED markers, and
- A distributed rendering system which can be extended to drive an arbitrary number of projectors.

The system steps:
- The system uses a single camera to obtain images of the scene
- determine the projection surface geometry with the information from the camera
- use a gigabit-Ethernet connected camera to detect the LED markers (located at the top of each
- dynamic projection using 10 projectors

- architectural visualization
- explore volumetric data by defining cross-sections
- general purpose user-interface elements

See a video with examples here.

Thursday, May 6, 2010

VVVV is a toolkit for real time video synthesis

It is designed to facilitate the handling of large media environments with physical
interfaces, real-time motion graphics, audio and video that can interact with many users simultaneously.

vvvv uses a visual programming interface. Therefore it provides a graphical programming language for easy prototyping and development.

vvvv is real time. Where many other languages have distinct modes for building and running programs, vvvv only has one mode - runtime.

vvvv is free for non-commercial use.

Here they explain how to project on 3D geometry.
They explain how vvvv handles projection onto a flat surface and projection onto an arbitrary surface.
In the first case, vvvv uses a homography to solve the projection.
In the second case, they build a virtual replica of the real scene in three steps:
1- define the origin of your real-world coordinate system
2- create the target projection surface as a 3D model and place it correctly in your virtual scene with respect to the coordinate system's origin (they note that this can be done with vvvv or another toolkit)
3- measure the position, orientation and lens characteristics of the projector

Everything in these three steps is done manually by the user, using vvvv or another toolkit.

Wednesday, May 5, 2010

Fast 3D scanning methods for laser measurement systems

This paper (written by Oliver Wulf and Bernardo Wagner) presents a laser time-of-flight method that can provide distance measurements at 50 meters with an error of centimeters.

They combine a 2D laser scanner with a servo drive, which allows different arrangements of scan planes and rotation axes that lead to different fields of view.

First they define 4 different combinations of scanner and servo drive, then they discuss the measurement density of each one. The measured points are not placed on a regular grid: the density is minimal for laser beams orthogonal to the rotation axis and maximal for beams parallel to this axis.

In this case there are two regions with high measurement density.

Then they show the differences in an experimental comparison: with the same number of points and the same scanning time, the results differ, and the most detailed region corresponds to the region of highest density.

They show which arrangement is better for indoor or outdoor scanning, depending on the chosen method (combination of scanner and rotation axis).

Calculating the 3D point cloud requires a transformation whose inputs are the raw 2D scan and the position of the 2D scanner.
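A minimal Python sketch of such a transformation is shown below for one possible arrangement. The geometry is our assumption, not taken from the paper: the 2D scanner sweeps a vertical plane, and the servo turns that plane around the vertical z axis through the scanner's origin. The function and parameter names are made up for the example, and angles are in radians.

```python
import math

def scan_point_to_3d(distance, beam_angle, servo_angle):
    """Convert one range reading to a 3D point.

    beam_angle is the elevation of the beam inside the scan plane,
    servo_angle is the rotation of the scan plane around the z axis.
    """
    horizontal = distance * math.cos(beam_angle)  # range along the ground
    height = distance * math.sin(beam_angle)
    return (horizontal * math.cos(servo_angle),
            horizontal * math.sin(servo_angle),
            height)
```

In this arrangement a beam parallel to the rotation axis (beam_angle of 90 degrees) hits the same point for every servo angle, which is why the measurement density peaks around the axis.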

Possible applications for 3D laser scanners are object localization and recognition for automated systems, safety systems, surveillance, navigation, etc.

Fast 3D Scanning with Automatic Motion Compensation (Stereo approach)

An intrinsic problem of phase-shifting methods is their inability to deal with blurred images caused by motion of the scanned object or person. Many modifications to the original structured light method have been proposed to support scanning of objects in motion. Please refer to Zhang's paper "Recent progress on real-time 3D shape measurement ... - Song Zhang" mentioned in the previous post. There Zhang covers some techniques for dealing with motion blur, with acceptable results.

In the paper presented below, the authors decided to replace the unwrapping phase of the structured light method with a stereo-based approach, so that the same correspondence problem is solved by a different mechanism. They argue that the unwrapping phase does not recover the absolute phase, and that if two surfaces being scanned have a discontinuity of more than 2pi, then no method based on phase-shifting will correctly unwrap these two surfaces relative to each other.

* Fast 3D Scanning with Automatic Motion Compensation - Thibaut Weise, Bastian Leibe and Luc Van Gool

We present a novel 3D scanning system combining stereo and active illumination based on phase-shift for robust and accurate scene reconstruction. Stereo overcomes the traditional phase discontinuity problem and allows for the reconstruction of complex scenes containing multiple objects. Due to the sequential recording of three patterns, motion will introduce artifacts in the reconstruction. We develop a closed-form expression for the motion error in order to apply motion compensation on a pixel level. The resulting scanning system can capture accurate depth maps of complex dynamic scenes at 17 fps and can cope with both rigid and deformable objects.

Real-time 3D shape measurement

Structured light as a technique for 3D reconstruction has been extensively adopted by industry and has proven to work in controlled environments. On the other hand, there is a lot of concern today about the performance of phase-shifting algorithms, especially when real-time 3D measurement comes into play.

Researchers are pursuing several directions to improve the performance of the algorithms, for example using different techniques to project the encoded stripes, delegating some calculations to the GPU, or improving the mathematical model itself.

The following two papers present recent research on reducing the computational cost of the phase-shifting algorithm. Both tackle the problem by improving the mathematical model: the arctangent in the phase formula is approximated with an intensity ratio calculation, and a lookup table (LUT) compensates for the approximation error.

Actually, if you look at the Structured Light source code, you'll notice a comment within the code suggesting to do what these papers propose instead of using the Java atan2 function:

public void phaseWrap() {
// this equation can be found in Song Zhang's
// "Recent progresses on real-time 3D shape measurement..."
// and it is the "bottleneck" of the algorithm
// it can be sped up with a look up table, which has the benefit
// of allowing for simultaneous gamma correction.
phase[y][x] = atan2(sqrt3 * (phase1 - phase3), 2 * phase2 - phase1 - phase3) / TWO_PI;
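To check that this formula really recovers the phase, here is a small Python sketch. It assumes the three patterns are sinusoids shifted by -120, 0 and +120 degrees; the ordering of the shifts and the function names are our assumption for the example. Unlike the Java line above, it returns the phase in radians without dividing by TWO_PI.

```python
import math

SHIFT = 2.0 * math.pi / 3.0  # 120 degrees between consecutive patterns

def three_phase_intensities(phi, offset=0.5, amplitude=0.4):
    """Intensities one pixel would see under the three shifted patterns."""
    return (offset + amplitude * math.cos(phi - SHIFT),
            offset + amplitude * math.cos(phi),
            offset + amplitude * math.cos(phi + SHIFT))

def wrapped_phase(i1, i2, i3):
    """Wrapped phase in (-pi, pi] from the three intensities.

    The numerator equals 3 * amplitude * sin(phi) and the denominator
    3 * amplitude * cos(phi), so offset and amplitude cancel out.
    """
    return math.atan2(math.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)
```

Because the ambient offset and the amplitude cancel, the result does not depend on the brightness of the surface, only on the position within the stripe period.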

* Fast three-step phase-shifting algorithm - Peisen S. Huang and Song Zhang - 2006

We propose a new three-step phase-shifting algorithm, which is much faster than the traditional three step algorithm. We achieve the speed advantage by using a simple intensity ratio function to replace the arctangent function in the traditional algorithm. The phase error caused by this new algorithm is compensated for by use of a lookup table. Our experimental results show that both the new algorithm and the traditional algorithm generate similar results, but the new algorithm is 3.4 times faster. By implementing this new algorithm in a high-resolution, real-time three-dimensional shape measurement system, we were able to achieve a measurement speed of 40 frames per second at a resolution of 532x500 pixels, all with an ordinary personal computer.

* Recent progress on real-time 3D shape measurement using digital fringe projection techniques - Song Zhang - 2009

Over the past few years, we have been developing techniques for high-speed 3D shape measurement using digital fringe projection and phase-shifting techniques: various algorithms have been developed to improve the phase computation speed, parallel programming has been employed to further increase the processing speed, and advanced hardware techniques have been adopted to boost the speed of coordinate calculations and 3D geometry rendering. We have successfully achieved simultaneous 3D absolute shape acquisition, reconstruction, and display at a speed of 30 frames/s with 300 K points per frame. This paper presents the principles of the real-time 3D shape measurement techniques that we developed, summarizes the most recent progress that have been made in this field, and discusses the challenges for advancing this technology further.

Tuesday, May 4, 2010

Coded Structured light as a technique to solve the correspondence problem

This paper covers the motivation, history and different techniques regarding the Structured Light and Coded Structured light methodologies for 3D surface reconstruction.

First, a passive stereovision system with two sensor/cameras is explained and the mathematical equations and geometrical constraints are analyzed in detail. Then, structured light as an active method is covered and presented as an alternative to solve the "correspondence problem". The mathematical model is explained as well. Then, the purpose of the coded structured light is described, analyzing temporal dependence, emitted light dependence and depth surface discontinuity dependence. Finally several coded structured light techniques are covered, discussed and compared.

Active and passive techniques are covered separately, and then, when the mathematical models are explained, the similarities are pointed out.

Even though this paper is quite old (1998), it covers structured light and vision systems for 3D reconstruction from a historical perspective. That makes it a very useful source for understanding structured light as a whole, and for inclusion in our State of the Art document.

We present a survey of the most significant techniques, used in the last few years, concerning the coded structured light methods employed to get 3D information. In fact, depth perception is one of the most important subjects in computer vision. Stereovision is an attractive and widely used method, but, it is rather limited to make 3D surface maps, due to the correspondence problem. The correspondence problem can be improved using a method based on structured light concept, projecting a given pattern on the measuring surfaces. However, some relations between the projected pattern and the reflected one must be solved. This relationship can be directly found codifying the projected light, so that, each imaged region of the projected pattern carries the needed information to solve the correspondence problem.

Automatic Projector Calibration

In his thesis work, Johnny Lee presented an interesting way to automatically calibrate a projector by embedding optical sensors into the projection surface.
The procedure consists of detecting the individual pixels illuminating the optical sensors.
This is done by projecting a series of Gray-coded binary patterns on the target surface. These patterns are coded so that every pixel projected on the screen can be identified. The number of patterns to project depends on the resolution of the projector. For a 1024x768 resolution, each pixel can be uniquely identified with only twenty patterns.
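The figure of twenty patterns follows from counting bits: ten patterns distinguish the 1024 columns, and ten more distinguish the 768 rows, since nine bits only cover 512. A short Python sketch of the arithmetic, with a function name of our own choosing and assuming one set of vertical and one set of horizontal stripe patterns:

```python
def gray_pattern_count(width, height):
    """Number of binary patterns needed to label every pixel uniquely."""
    column_bits = (width - 1).bit_length()   # bits to count 0..width-1
    row_bits = (height - 1).bit_length()     # bits to count 0..height-1
    return column_bits + row_bits
```

Doubling the resolution in both directions therefore adds only two patterns.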

Once the key pixels have been detected on the target surface it is possible to find the homography that transforms the screen pixels to the projected locations. Once this transformation is known, pre-warped images may be transmitted to the projector.
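With four sensors, the homography can be computed by solving a small linear system. The Python sketch below is a generic four-point solution and not necessarily how the thesis implements it. It fixes the last matrix entry to 1, which assumes that entry is not zero, and the function names are ours.

```python
def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def find_homography(src, dst):
    """3x3 homography mapping four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(h, point):
    """Map a 2D point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, src can hold the corners of the image to display and dst the projector pixels detected at the four sensors; warping the image with the resulting matrix makes it land exactly on the surface.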

This project has many applications besides projector calibration.
It can be used for creating a large display using tiled projection. This is known as stitching.
It can also be used to project multiple versions of the same content on a surface to reduce the shadows when one projector is blocked.
Finally, another of the possible uses is to register the orientation of a 3D surface. This requires the geometry of the surface to be known and the optical sensors to be in the visibility range from the projector.

A video explaining the usage and other details of the implementation can be seen here.

Wednesday, April 28, 2010

Augmented Reality System

Augmented Reality (AR) is the synthesis of real and virtual imagery. In contrast to Virtual Reality (VR) in which the user is immersed in an entirely artificial world, augmented reality overlays extra information on real scenes.

Interactive Augmented Reality Techniques for Construction at a Distance of 3D Geometry

This paper presents an augmented reality system based on techniques for construction at a distance.
It uses a mobile, wearable augmented reality computer that can be used outdoors. The scale of the world is fixed and the user’s presence controls the view. The user interacts with the computer using hand and head gestures.
  • The application developed for this system is Tinmith-Metro.
To see video demos and more go

Details of software architecture.

Monday, April 19, 2010

Building a three-dimensional model

One point to resolve is building a three-dimensional model from a given cloud of points. Below we present papers that address distinct steps of this process.

Here the problem is presented segmented into patches, each of which represents a discrete surface region of the physical object:
-Physical Design Model
-Digitize Model
-Cloud Data Set
-Apply Reverse Engineering Software
-Computer based Design Model

Initially a laser-based range sensor is used to obtain the cloud data; then a triangulation method and growth rules for building the mesh are presented.

  • The complexity of the sample data causes problems of size (a lot of memory is needed to process all the data) and quality (noise introduced while generating the samples). To overcome these problems, the point cloud can be reduced or simplified by algorithms before the model is built.

In the paper Efficient Simplification of Point-Sampled Surfaces the methods presented to resolve the pre-processing problem are:

-clustering methods, split the point cloud into subsets, each subset is replaced by one representative.

-Iterative simplification, successively collapses point pairs in a point cloud according to a quadric error metric.

-particle simulation, computes new sampling positions by moving particles on the point-sampled surface according to interparticle repelling forces.
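To give a feel for the clustering idea, here is a minimal Python sketch that groups points by a uniform grid and replaces each group with its centroid. This is a simplified stand-in and not one of the paper's algorithms, which choose the clusters more carefully; the function name and the cell_size parameter are ours.

```python
import math
from collections import defaultdict

def simplify_by_clustering(points, cell_size):
    """Replace all points falling in the same grid cell by their centroid."""
    cells = defaultdict(list)
    for point in points:
        # Integer cell coordinates identify the cluster of each point.
        key = tuple(int(math.floor(coord / cell_size)) for coord in point)
        cells[key].append(point)
    return [tuple(sum(axis) / len(group) for axis in zip(*group))
            for group in cells.values()]
```

A larger cell_size gives fewer representatives and a coarser model, which makes the memory and quality trade-off mentioned above explicit.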

They compare and analyze the algorithms with emphasis on efficiency and low memory footprint.

This approach is based on the representation of free-form surfaces: different meshes are built from each view, and a curvature measure is computed at every node of the meshes and mapped to a spherical image.

Each mesh is represented with a graph in which each node has three neighbors, and its curvature is computed from the relative positions of its neighbors.

In summary, in these studies we reviewed some methods and techniques that address distinct aspects of building a 3D model; they serve as an introduction to the study of the problem.

Wednesday, April 7, 2010

Structured light - First experiment

An approach to the problem of scanning objects to 3D geometry is a technique known as structured light. Kyle McDonald's instructable clearly explains the steps needed to create a scan using this technique. The necessary program is available on Google Code.

Since this is a potential technique for our project, a good start would be to reproduce the experiment in ideal conditions.
These conditions are:
  • Good lighting on the scene (only illuminating the target object)
  • Perfect camera orientation
  • A simple object to be scanned
Although structured light is robust enough to make successful scans in less strict conditions, it seems useful to have an idea of what are the best results we can expect. This can only be done in ideal conditions.

The experiment consisted of simulating the camera, projector and target object in 3D software to generate three images and then, using these images and the ThreePhase applet found in the Google Code project mentioned above, retrieving the 3D geometry.
A simple cone was used as the target object. The projector was simulated using a spot light with a projection map, looking at the object in exactly the same direction as the camera.
The scene was rendered three times to a bitmap using three different patterns as the projector map. These patterns can be found in the project folder. Finally, the applet was used with these three renderings and the 3D geometry was retrieved as a cloud of points.

Experiment deployment

The first results were not ideal due to the projector and camera having the same location. Hence, the pattern projected on the target object was undeformed and no depth information could be retrieved.
To correct this, the projector was pulled away from the camera and the experiment was repeated twice using different distances.
The best results were obtained using the experimental setup shown above.

The acquired results can be used in next stages of the project as a starting point to create 3D Meshes from clouds of points.

The collected renderings using the three patterns:

Monday, April 5, 2010


Every blog starts with an introduction so this one won't be an exception.
The project is part of our computer engineering degree. The people working on it are Adriana, Javier and Daniel, with Tomas as tutor.
As stated in the description, this blog will demonstrate the development process of our project called "Automatic Modelling & Video Mapping". The original name in Spanish is "Proyecciones sobre superficies irregulares" and the outline can be read here (Spanish only).

We identified 3 major tasks:
  • Scan objects into 3D geometry
  • Allow users to edit the 3D geometry obtained in the previous step
  • Apply the needed transformations to allow video mapping on the 3D geometry and then onto the real life objects
We'll blog about different techniques related to these tasks, then we'll choose one or a few of these techniques to implement. Finally we'll put all the code together to try to make a useful and user-friendly tool (open source, of course).