This paper discusses an approach to the problem of annotating multimedia content. Our approach provides annotations as metadata for indexing, retrieval and semantic processing, as well as for content enrichment. We use an underlying model for structured multimedia descriptions and annotations that allows spatial, temporal and linking relationships to be established. We discuss aspects of documents and annotations that were used to guide the design of an application in which annotations are made through pen-based interaction on Tablet PCs. As a result, a video stream can be annotated while it is being captured. Moreover, the annotations can later be edited, extended or played back synchronously.