Real-time cinematography for games
Each idiom operates as a state machine and defines the scene shots to be applied; idioms demand human interpretation of the scenes to be correctly used. However, cinematographers have defined some general rules. Some examples are: the camera should keep itself on one side of the action, not making unexpected movement shots; the camera can switch sides, but only upon an establishing shot, which shows that the scene must be broken in at least two shots.

Halper et al. select camera shots based upon constraint specifications; however, high constraint satisfaction implies poor frame coherence. Charles et al. and Courty et al. apply camera control in plot-based applications. In this work, we try to contribute towards this goal by proposing a novel approach for the architecture and implementation of a cinematography virtual director.

With the advancement of the industry and the emergence of new technologies in digital video with high-definition formats, these demands are expanded. Now it is necessary to consider both aesthetically pleasing images and the technical aspects involved with using cameras, lights, and other equipment [18]. When reasoning about frames, it is often helpful to think of a film as having a hierarchical structure: at the highest level, a film is a sequence of scenes.

IV.
The Scriptwriter is responsible for controlling the plot and the story flow; the Scenographer is responsible for creating and arranging the sceneries; the Director defines the camera shots. Figure 1 shows a diagram of the architecture.
The director visualizes the script by giving abstract concepts a concrete form. The director establishes a point of view on the action that helps to determine the selection of shots, camera placements and movements. The director is responsible for the dramatic structure and directional flow of the film. In our system, the role of the director is to choose which shot should be used at each time to highlight the scene emotion and to present the content in an interesting and coherent manner.
The main component of our architecture and the focus of this work is the Director. It concentrates the cinematography knowledge and decides, in real-time, the best way to present scenes. The knowledge is represented by means of several support vector machines trained to solve cinematography problems involving camera shot selection. Support vector machines are used as an effective method for general-purpose pattern recognition; they are based on statistical learning theory and are specialized for small sample sets. Support vector machines have better generalization than neural networks and guarantee a global optimal solution, while neural networks may converge to local optima. A similar approach is used by Passos et al.

Figure 1. System architecture.

To perform this task, the director uses a collection of support vector machines trained to classify the scenes. The process consists of two steps. First, the training process, which is done before the story dramatization, consists in simulating some common scenes and defining the solution for the shot selection. The features of these scenes, actors and environment are used to teach the support vector machine how to proceed in this situation, in order to detect similar situations in the future. The second step is the prediction process, done in real-time during the dramatization, which uses the knowledge acquired through the training process to classify an unknown situation. The prediction takes as input the important features from the environment, scene, and involved actors. The output is the selected shot that best matches the input features, as shown in Figure 2.
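The two-step flow above (train offline on simulated scenes, then predict shots in real time) can be sketched as follows. This is a minimal stand-in, not the paper's implementation: a nearest-centroid classifier plays the role of the trained support vector machine, and the feature layout and shot names are illustrative assumptions.

```python
from math import dist

# Stand-in for the trained SVM: a nearest-centroid classifier over
# labeled scene-feature vectors. Feature values and shot names are
# illustrative only.
class ShotClassifier:
    def __init__(self):
        self.centroids = {}  # shot label -> mean feature vector

    def train(self, samples):
        """samples: list of (feature_vector, shot_label) pairs."""
        by_shot = {}
        for features, shot in samples:
            by_shot.setdefault(shot, []).append(features)
        for shot, vecs in by_shot.items():
            n = len(vecs)
            self.centroids[shot] = [sum(col) / n for col in zip(*vecs)]

    def predict(self, features):
        """Return the shot whose training centroid is closest."""
        return min(self.centroids,
                   key=lambda s: dist(features, self.centroids[s]))

# Step 1: offline training on simulated scenes.
clf = ShotClassifier()
clf.train([([0.9, 0.1], "close-up"), ([0.8, 0.2], "close-up"),
           ([0.1, 0.9], "establishing"), ([0.2, 0.8], "establishing")])

# Step 2: real-time prediction during the dramatization.
print(clf.predict([0.85, 0.15]))   # -> close-up
```

The same train-once, predict-many structure holds regardless of which classifier stands behind `predict`.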
In recent years, support vector machines have been found to be remarkably effective in many real-world applications such as in systems for detecting microcalcifications in medical images [8], automatic hierarchical document categorization [2], spam categorization [7], among others.
In our system, the modules are agents that communicate with each other by means of message exchange and can be summarized as follows:
1. The Scriptwriter reads the information about the current scene from the story plot and sends it to the Scenographer;
2. The Scenographer prepares the actors and scenario for the scene dramatization and also places objects and involved actors in the scene.
The information about the scenario is sent to both the Cameraman and the Director;
3. The Cameraman, following cinematography rules, creates the line of action and positions the cameras for the scene;
4. The Director extracts from the scene all important data and applies them to a support vector machine to select the best shot for the scene. This information is then sent to the Cameraman;
5. The Cameraman activates the shot selected by the Director and, if necessary, executes a camera movement or zooming operation.

Figure 2. Support vector machine input and output.

A. Support Vector Machine
The support vector machine, proposed by Vapnik [22], is a powerful methodology for solving machine-learning problems. It consists of a supervised learning method that tries to find the biggest margin to separate different classes of data.
Kernel functions are employed to efficiently map the input data into a high-dimensional feature space. The original idea of SVM is to use a linear separating hyperplane to separate the training data set into two classes. The optimal hyperplane can be written as w·x + b = 0 and the separating margins as w·x + b = ±1. A set of vectors is separated by the optimal hyperplane if and only if it is separated without error and the distance between the closest vector and the hyperplane is maximal. The support vectors are located on the separating margins and are usually a small subset of the training data set.

Figure 3. Optimal hyperplane separating two classes.

In other words, classifying an unknown vector corresponds to finding on which side of the hyperplane it lies. For some data sets, as in Figure 4, the separation requires a curve that is more complex than a simple line.
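For reference, the standard hard-margin formulation behind the description above can be written as follows. This is textbook SVM notation following Vapnik's presentation, not notation taken from the paper itself:

```latex
% Separating hyperplane and margins:
\langle w, x \rangle + b = 0, \qquad \langle w, x \rangle + b = \pm 1

% The margin width is 2/\|w\|, so the optimal hyperplane solves:
\min_{w,\,b} \; \tfrac{1}{2}\|w\|^{2}
\quad \text{subject to} \quad
y_i\left(\langle w, x_i \rangle + b\right) \ge 1, \quad i = 1,\dots,l

% An unknown vector x is classified by the side of the hyperplane it falls on:
f(x) = \operatorname{sign}\!\left(\langle w, x \rangle + b\right)
```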
Figure 4. Non-linearly separable classes.

To handle such cases, the SVM formulation is extended. First, it allows training errors. Second, the data are mapped into a higher-dimensional space; in this higher space, it is possible that the features may be linearly separated [23]. The optimization problem can then be described over this space, taking the training errors into account.

SVM was originally designed for binary pattern classification. For our problem, a multi-class classification is needed to select among the possible shots of a scene. To solve this problem, we use the "one-against-one" approach [15], in which one classifier is constructed for each pair of classes and each one trains data from two different classes, creating a combination of binary SVMs.
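The "one-against-one" scheme above can be sketched as follows: for k classes, k(k-1)/2 binary classifiers are built, one per unordered pair, and their votes are combined. The pairwise deciders below are toy threshold rules standing in for trained binary SVMs; the shot names are illustrative assumptions.

```python
from itertools import combinations
from collections import Counter

SHOTS = ["close-up", "medium", "establishing"]

def make_pairwise_decider(a, b):
    # Toy rule: pick the class whose index is closer to the feature value.
    # In the real system this would be a binary SVM trained on classes a, b.
    def decide(x):
        return a if abs(x - SHOTS.index(a)) <= abs(x - SHOTS.index(b)) else b
    return decide

pairs = list(combinations(SHOTS, 2))
# One binary classifier per pair: k(k-1)/2 of them.
assert len(pairs) == len(SHOTS) * (len(SHOTS) - 1) // 2

deciders = [make_pairwise_decider(a, b) for a, b in pairs]

def predict(x):
    # Each pairwise classifier votes; the majority class wins.
    votes = Counter(decide(x) for decide in deciders)
    return votes.most_common(1)[0][0]

print(predict(0.1))   # -> close-up
print(predict(2.0))   # -> establishing
```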
If the data are linearly non-separable, the inner product is in most cases replaced by some special kernel function; as in Figure 4, the classes need to be mapped and rearranged using a kernel function in a high-dimensional space. Some popular kernels are the radial basis function kernel and the polynomial kernel.

In order to train the support vector machines, we simulate some common situations that happen in real films. This training database is created once and is used in all future dramatizations. The emotional state influences the selected shot, to highlight the emotional actor state. The position of the actor is the most important feature because the actor must be visible in the shot. The emotional state happiness is, for example, represented by the value 1, sadness by the value 2.
All features are then normalized between -1 and 1. The classes are the possible camera shots for the scene. These shots are defined in our system by the Cameraman module, which, for each scene, creates a line of action and positions the cameras in an appropriate location, improving the scene visualization by following standard cinematography rules and patterns proposed by Arijon [].
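The encoding above (categorical emotional states mapped to numeric codes, then every feature rescaled to [-1, 1]) can be sketched as follows. The world-coordinate range and the two-emotion vocabulary are illustrative assumptions, not values from the paper.

```python
# Emotional states become numeric codes, as in the text:
# happiness = 1, sadness = 2.
EMOTION_CODE = {"happiness": 1, "sadness": 2}

def scale(value, lo, hi):
    """Min-max normalize value from [lo, hi] to [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def encode_scene(actor_a_pos, actor_b_pos, emotion_a, emotion_b,
                 world=(-50.0, 50.0)):
    # 6 position features (X, Y, Z for each actor), normalized against an
    # assumed world-coordinate range...
    features = [scale(c, *world) for c in (*actor_a_pos, *actor_b_pos)]
    # ...plus the two encoded emotional states, also scaled to [-1, 1].
    for emotion in (emotion_a, emotion_b):
        features.append(scale(EMOTION_CODE[emotion], 1, len(EMOTION_CODE)))
    return features

vec = encode_scene((0, 0, 0), (25, 0, -25), "happiness", "sadness")
print(vec)   # 8 features, each in [-1, 1]
```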
For this scene, we can extract 9 features: the positions X, Y and Z of the two actors (6 features) and the emotional state of the two actors.

Figure 5. Classes mapped and rearranged to become linearly separable.

Director Architecture.

Predicting Process
With the support vector machines trained with cinematography knowledge, the Director module is able to act as a film director and, based on its previous experience, select in real-time the best shots to show the scenes.
The result of our support vector machine is the camera shot classified as the best solution to show the scene. The scenes are composed of different shots; the transition between the shots occurs when an important event happens in the scene, for example when the emotional state of an actor changes or when an actor executes an action. The director detects these events in real-time and executes the predicting process, using the support vector machine knowledge to choose the new shot.
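The event-driven transition logic above can be sketched as follows: the director keeps the current shot until an important event (an emotion change or an actor action) is detected, and only then re-runs the prediction. The stand-in classifier and the event/state structure are illustrative assumptions.

```python
def important_event(prev_state, state):
    # The two event types named in the text: an emotional-state change
    # or an actor executing an action.
    return (prev_state["emotion"] != state["emotion"]
            or state.get("action") is not None)

def run_director(states, classify):
    shots, current, prev = [], None, None
    for state in states:
        if prev is None or important_event(prev, state):
            current = classify(state)      # predicting process
        shots.append(current)              # otherwise keep the shot
        prev = state
    return shots

# Stand-in for the trained SVM.
classify = lambda s: "close-up" if s["emotion"] == "sadness" else "medium"

timeline = [
    {"emotion": "happiness", "action": None},
    {"emotion": "happiness", "action": None},  # no event: shot kept
    {"emotion": "sadness",  "action": None},   # emotion change: re-predict
]
print(run_director(timeline, classify))
# -> ['medium', 'medium', 'close-up']
```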
Consider a scene where the actor chases an animal (Figure 8). We have two possible shots for this scene: camera A and camera B. The director detects in real-time the type of the scene and activates the support vector machine for chasing scenes. Every time a new support vector machine is selected, an initial shot must be selected, so the director extracts from the environment the features used by the active support vector machine and applies these features to it.
Figure 6. Possible camera shots for a dialog scene.

Figure 7 illustrates this combination of support vector machines. The director has N support vector machines, each one with different inputs and outputs. When a new important event occurs, for instance when, along the chase, the actor speaks something, the director executes the predicting process again to select the new shot.
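The arrangement of N classifiers with different inputs and outputs can be sketched as a registry keyed by scene type, where each entry carries its own feature extractor and classifier, and switching scene type immediately yields an initial shot. All extractors and decision rules here are illustrative stand-ins, not the paper's trained SVMs.

```python
# Per-scene-type (feature extractor, classifier) pairs: each "SVM" has
# its own inputs (features) and outputs (possible shots).
REGISTRY = {
    "dialog": (lambda s: [s["distance"]],
               lambda f: "close-up" if f[0] < 2.0 else "medium"),
    "chase":  (lambda s: [s["speed"]],
               lambda f: "camera A" if f[0] > 5.0 else "camera B"),
}

class Director:
    def __init__(self):
        self.scene_type = None

    def select_shot(self, scene_type, scene_state):
        extract, classify = REGISTRY[scene_type]
        if scene_type != self.scene_type:
            # A new support vector machine was selected:
            # an initial shot must be chosen right away.
            self.scene_type = scene_type
        return classify(extract(scene_state))

d = Director()
print(d.select_shot("chase", {"speed": 7.0}))      # -> camera A
print(d.select_shot("dialog", {"distance": 1.0}))  # -> close-up
```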
The training sets are used to train the support vector machine, and the samples of the current test set are predicted. Correctly and wrongly predicted shots are then computed. Table 1 shows the computed results of this test with the training set size ranging from 10 to 55 samples. The presented percentages of accuracy correspond to the average of the results obtained for the different support vector machines.
Figure 8. Possible camera shots for a chasing scene.

The second test is the recognition rate, to check the accuracy of the predicted shots. The tests were run on an Intel Core 2 Quad 2.

Table 1. Recognition rate with different training sets.
To test the performance of our proposed solution, we trained our support vector machines with different numbers of samples and used them to predict the shots for a sequence of 6 scenes, with a total of approximately 40 different shots. For each shot, we calculate the time necessary for the prediction process. Figure 9 shows performance results in a line chart with the training set size ranging from 10 to 55 samples.

It is clear that the computational cost grows almost linearly with the number of samples. More samples result in a higher accuracy but in slower recognition. However, even with small training sets we obtain a high percentage of correct recognition of the best shots, ensuring high accuracy in the shot selection without high computational costs.
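The measurement described above can be sketched as follows: time the prediction step while the training-set size grows, mirroring the 10-to-55-sample sweep. A brute-force nearest-neighbour lookup stands in for the SVM (its cost also grows with the number of training samples); absolute timings depend on the machine.

```python
import time

def predict(training_set, x):
    # Stand-in predictor whose cost grows with the training-set size,
    # as observed for the SVMs in the paper.
    return min(training_set, key=lambda s: abs(s[0] - x))[1]

for n in range(10, 56, 15):
    training_set = [(i / n, "shot-%d" % (i % 3)) for i in range(n)]
    start = time.perf_counter()
    for _ in range(1000):
        predict(training_set, 0.5)
    elapsed = (time.perf_counter() - start) / 1000
    print("n=%2d  avg prediction time: %.2e s" % (n, elapsed))
```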
In this paper we have presented an intelligent cinematography director that uses a collection of support vector machines trained with cinematography knowledge to select in real-time the best scene shots in storytelling dramatization.
Our methodology is applicable not only to storytelling systems; it can be adapted to other entertainment applications, such as games, virtual worlds or 3D simulations. This approach ensures that, most of the time, the selected shots are the best solution to show the scene in accordance with cinematography principles and rules.
Figure 9. Prediction performance test with different training sets.

A scene typically opens with an establishing shot that shows the setting and the relative positions of the characters. Following this, medium shots may be used to introduce the main characters. A medium shot frames characters from the thighs up. After a number of medium shots, the camera can move in for close-ups where an individual character is depicted from the waist, or higher, to above the head.
As the characters move around the scene and after a certain time duration it may be necessary to use another establishing shot to allow the viewer to regain their bearings. As the camera moves closer in this way from establishing shots to close-ups, the views become more subjective and the audience tends to identify more with the characters.
As such, dramatic emphasis can be added to certain events and the emotional state of the characters portrayed. The greatest subjectivity is achieved with first-person perspective.
To counteract the possibility of the viewer becoming disorientated due to numerous close-ups in sequence, for example, film makers ensure that screen direction [1] is preserved. For example, if character A is to the left of character B at the end of a shot, these relative positions should be the same at the start of the next shot. Likewise, if a character is walking towards the right of the screen at the end of a shot, the camera should be positioned such that they are still walking to the right when the following shot begins.
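The screen-direction rule above reduces to two simple checks across a cut: the relative left/right ordering of two characters must be preserved, and a walking character must keep moving toward the same screen edge. A minimal sketch, assuming screen-space x coordinates; the data structures are illustrative:

```python
def ordering_preserved(end_of_shot, start_of_next):
    """True if character A stays on the same side of B across the cut.

    Each argument is a pair (a_x, b_x) of screen-space x positions.
    """
    (a1, b1), (a2, b2) = end_of_shot, start_of_next
    return (a1 < b1) == (a2 < b2)

def walk_direction_preserved(velocity_end_x, velocity_start_x):
    """True if the character keeps moving toward the same screen edge."""
    return (velocity_end_x > 0) == (velocity_start_x > 0)

# A left of B at the cut, still left of B after it: continuity holds.
print(ordering_preserved((100, 300), (80, 250)))   # -> True
# The cut flips A to the right of B: screen direction is broken.
print(ordering_preserved((100, 300), (400, 250)))  # -> False
```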
A single camera or multiple cameras may be used to film a motion picture. A single camera is used where the action is planned in advance and so the scene is set up for each shot.
Where the action is unplanned or improvised, multiple cameras are used to film multiple angles of the events. In both cases the resulting footage is edited afterwards. On a film set, it is the director who chooses the events to film and the director of photography, or cinematographer, who decides the style of filming and coordinates the camera operators. Since the action during game play is unplanned, cinematography dictates that multiple cameras should be used.
With the synchronisation of multiple cameras, the view can switch between multiple camera angles, thereby exploiting a fundamental feature of cinematography. More elaborate camera work is used in cut-scenes inserted at intervals throughout the game play in some games. However, these are non-interactive pre-rendered scenes; our focus is on the camera work during the interactive game play. Game engines already provide bots, computer-controlled characters, and it is possible to give them specific instructions regarding their position and their movement in such a way that they mimic the behaviour of camera operators on a film set.
The game engine can then present the action from the viewpoint of any of these bots. In particular, bots are capable of taking up a position and looking in a certain direction; they can find a path through a virtual environment that avoids obstacles [13, 14]; they can crouch, turn, follow a subject, approach a subject and retract from a subject.
These are all qualities that are expected of a camera operator on a film set. Real world filming, however, also requires the use of special wheeled structures for moving cameras smoothly; cranes and purpose-built scaffolding are needed for elevated camera placement; special camera housing must be used for under water shots and so on [2].
With bots, these problems do not exist. Bots can move smoothly or in a jerky fashion as required; a bot can be any height, easily achieving so-called doggie-cam shots or crane shots; bots can fly and swim and so provide footage from otherwise difficult angles. A director module decides what subject matter is to be filmed, a cinematographer module decides how it is to be filmed and a number of CameraBots film it.
More specifically, the director examines the action occurring in the game world and decides what events are to be filmed. The cinematographer examines these events and the arrangement of characters and props in the setting in which they occur and selects a suitable model or idiom to employ in its filming. At this stage, the director can provide feedback if there are a number of candidate idioms. The cinematographer now ensures that the necessary CameraBots are in the game world and provides them with instructions such as the characters to film as indicated by the idiom.
Our modular approach means that new types of CameraBots can be added to our virtual cinematography system with ease. The CameraBots observe the continuity of screen direction as discussed in the Cinematography section. Real-time editing of the footage provided by the bots is performed by the cinematographer. What the player sees is a number of static and moving views from different angles suited to the events occurring.
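The division of labour described above (the director chooses what to film, the cinematographer chooses an idiom and assigns CameraBots, and real-time editing selects among their views) can be sketched as follows. The event format, idiom table, and bot interface are illustrative assumptions, not the actual Quake II modification.

```python
class CameraBot:
    """Stand-in for a bot instructed to film a subject."""
    def __init__(self, name):
        self.name = name
    def film(self, subject):
        return "%s framing %s" % (self.name, subject)

class Cinematographer:
    # Hypothetical idiom table: event type -> list of shots to cover.
    IDIOMS = {"dialogue": ["over-shoulder", "reverse"],
              "chase": ["tracking"]}

    def __init__(self, bots):
        self.bots = bots

    def cover(self, event):
        # Select an idiom for the event and assign a bot to each shot.
        idiom = self.IDIOMS[event["type"]]
        views = [self.bots[i % len(self.bots)].film(event["subject"])
                 for i, _ in enumerate(idiom)]
        # Real-time editing: cut to one of the available views.
        return views[0]

class Director:
    def __init__(self, cinematographer):
        self.cinematographer = cinematographer
    def observe(self, event):
        # The director decides this event is worth filming.
        return self.cinematographer.cover(event)

bots = [CameraBot("bot-1"), CameraBot("bot-2")]
director = Director(Cinematographer(bots))
print(director.observe({"type": "dialogue", "subject": "player"}))
# -> bot-1 framing player
```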
By using the AI portion of a game engine, i.e. its bots, we obtain these camera capabilities directly. Presently, we are implementing our virtual cinematography system as a modification to the Quake II game engine. The code is written in a modular way to allow portability to other game engines. In the future we plan to add new types of CameraBots to our framework and increase the functionality of the existing ones to allow for a wider range of cinematic camera work.
References
Mascelli, J. Los Angeles: Silman-James Press.
Brown, B. Oxford: Focal.
Amerson, D.
Bares, W. Task-sensitive cinematography interfaces for interactive 3D learning environments. In Proceedings of the 3rd International Conference on Intelligent User Interfaces.
Christianson, D. Declarative Camera Control for Automatic Cinematography.
Halper, N.