The presentation begins with descriptions of activities which may be enhanced by networked media services, and then moves on to describe the basic and specific services that may be of use in more detail. These descriptions are followed by a short discussion of a simplified approach to strategic planning.
For example, audio/video conferencing can bring teachers and students together
across vast distances, audio/video demonstrations are often used to enhance
lectures, and lectures may easily be recorded for viewing at a later date.
The biggest problem facing educators and technologists is not whether
to use technology but which technologies, how, when and in what combinations.
Each of the activities above may derive some benefits from the application
of various technologies, but participants must determine which tools to
employ and in what combinations and modes.
The next sections will describe the very basic audio/visual tools available to
educators and then discuss combinations of these basic tools and finally
specific implementations of such combinations. Note that it will be assumed
that non-AV tools will be integrated with these video tools as needed.
For example, the Web, middleware facilities, desktop applications, phones
and faxes may all be used in concert with AV tools, but will be
treated as ancillary technologies in this discussion.
For example, here is one sequence of events that can occur while moving a
video signal from one point to another. To begin with, a scene is
shot using a video camera which converts it to an S-Video signal stream
that is converted to digital form (or "captured") in an H.261 storage format
(among many others), conveyed across a network embedded in some transport
protocol such as RTP
(the Realtime Transport Protocol), and converted back
into S-Video at the receiving end where it can be
displayed by a standard video projector. The process of conversion to
some storage format (such as H.261) usually involves some loss of image quality
to accommodate limited disk storage and/or limited bandwidth during
transport.
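The "embedded in some transport protocol" step can be sketched in a few lines. The following is a minimal illustration, not a complete implementation: it builds only the 12-byte fixed RTP header and prepends it to an already encoded video fragment. The function names are hypothetical, and a real H.261 stream would also carry a codec-specific payload header, which is omitted here.

```python
import struct

RTP_VERSION = 2
H261_PAYLOAD_TYPE = 31  # static payload type listed for H.261 in RFC 3551


def build_rtp_packet(payload, seq, timestamp, ssrc, marker=False,
                     payload_type=H261_PAYLOAD_TYPE):
    """Prefix an encoded video fragment with a 12-byte fixed RTP header."""
    # First byte: version in the top two bits; no padding, extension or CSRCs.
    byte0 = RTP_VERSION << 6
    # Second byte: marker bit followed by the 7-bit payload type.
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_packet(packet):
    """Split a packet built above back into header fields and payload."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12:],
    }
```

The receiving end uses the sequence number to detect lost or reordered packets and the timestamp to schedule playback before converting the decoded frames back to a displayable signal.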
In general, such processes provide renditions of the original sounds and
images with varying degrees of verisimilitude, which correspond to
variations in product "quality."
For example, an NTSC camera scans a video image as 525 separate
scan lines, which can be captured at relatively low or high video
"resolutions," usually specified as a rectangular collection of picture
elements, or "pixels". The NTSC image is considered to have
a quality roughly equivalent to 680 by 483, whereas one popular
video conferencing format (H.261) converts NTSC images
into 352 by 288 (CIF, the Common Intermediate Format) or 176 by 144 (QCIF, or Quarter CIF) collections
of pixels.
MPEG-1 streams typically encode CIF images considered similar to the
quality yielded by VHS recording technology, but can produce an image
of up to 4095 by 4095 pixels.
MPEG-2 is typically used to produce 720 by 576 images, but can be used to
create 1920 by 1080 images when fed a High Definition input video signal
(2200 horizontal samples by 1125 scan lines).
H.261 used for video conferencing typically generates 64kbps to 1.5Mbps
traffic.
MPEG-1 streams use up to 1.5Mbps.
MPEG-2 is used by cable and satellite systems, generating 4 to 9Mbps for
SDTV and 19.2Mbps for HDTV over cable, and for DVDs stored at 720 by 576.
In addition, image capture can occur at differing frame rates (3 or 4 frames
per second up to 32 or so), and frames are sometimes lost during transport,
all of which can significantly reduce video quality.
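The relationship between image size, frame rate and network load can be made concrete with some simple arithmetic. The sketch below assumes 12 bits per pixel (4:2:0 sampling at 8 bits per sample), which is an assumption of this example rather than a figure from the text above. The frame-loss helper is deliberately naive: it treats every lost frame as independent, which understates the damage when frames are encoded relative to their neighbors.

```python
# Pixel dimensions quoted in the discussion above.
FORMATS = {
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "MPEG-2 SD": (720, 576),
    "MPEG-2 HD": (1920, 1080),
}

BITS_PER_PIXEL = 12  # assumption: 4:2:0 sampling, 8 bits per sample


def raw_bitrate_bps(width, height, fps, bits_per_pixel=BITS_PER_PIXEL):
    """Uncompressed bit rate of a video stream, in bits per second."""
    return width * height * bits_per_pixel * fps


def compression_ratio(width, height, fps, channel_bps):
    """How much the encoder must shrink the raw stream to fit the channel."""
    return raw_bitrate_bps(width, height, fps) / channel_bps


def delivered_fps(capture_fps, loss_fraction):
    """Naive estimate of the frame rate seen by the viewer after frame loss."""
    return capture_fps * (1.0 - loss_fraction)
```

Under these assumptions, `compression_ratio(*FORMATS["CIF"], 30, 1_500_000)` works out to roughly 24 to 1, which gives a sense of why conversion to a compressed format costs some image quality.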
Ancillary services
To help users make contact with one another, a variety of whitepages schemes
have been developed. For example, the H.323 suite includes directory
capabilities that enable users to "call" one another by clicking through
a contact list.
To help users find each other in a mobile environment with multiple
communications devices under their control, the Session Initiation
Protocol (SIP) has been designed to keep track of potential endpoint devices
and, along with the H.350 LDAP schema, even facilitate device configuration
and password management. In effect SIP can act like a standardized
"buddy list," and at least one observer (Larry Amiott of Northwestern) has
suggested that managing users' availability for communication may itself
be a "killer app," even without managing communication devices.
Others have suggested that truly effective SIP-like services could
finally propel both VoIP and video conferencing into common use.
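The "buddy list" idea can be illustrated with a small in-memory registry that tracks which devices each user currently has available. This is only a conceptual sketch: the class and method names are hypothetical, the addresses are made-up examples, and nothing here speaks the SIP protocol or uses the H.350 schema.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    uri: str                # contact address, e.g. "sip:alice@desk.example.edu"
    kind: str               # free-form label such as "desk phone" or "laptop"
    available: bool = True


@dataclass
class PresenceRegistry:
    devices: dict = field(default_factory=dict)  # user name -> list of Device

    def register(self, user, device):
        """Record that a device belongs to a user."""
        self.devices.setdefault(user, []).append(device)

    def set_available(self, user, uri, available):
        """Mark one of a user's devices as reachable or unreachable."""
        for device in self.devices.get(user, []):
            if device.uri == uri:
                device.available = available

    def reachable(self, user):
        """Addresses at which the user can currently be contacted."""
        return [d.uri for d in self.devices.get(user, []) if d.available]

    def buddy_list(self, users):
        """Map each user to True if at least one device is reachable."""
        return {u: bool(self.reachable(u)) for u in users}
```

A caller would consult `reachable()` to decide where to direct a call, while `buddy_list()` reflects the narrower availability-only view mentioned above.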
Instant messaging offers both an alternative channel for communications
while attending a video conference, and also a "back channel" for dealing
with technical difficulties. It can also assist video conferees during
cross cultural and/or multilingual exchanges.
In addition, several groups are trying to develop "middleware" products that
will support user authentication that can be useful in many video applications.
The Internet2 MiddleWare Initiative is the most relevant to Kansas
universities, but other initiatives will be necessary for different types
of organizations.
Authentication and security requirements for video services appear to be
somewhat different from other applications, mostly due to the (possibly
multi-party) real-time nature of video and VoIP.
Other ancillary services include whiteboards, desktop sharing, and
shared virtual- or mediated-reality applications.
Furthermore, different combinations of these basic tools
will be integrated to different degrees. Some products will include
all of these tools as a collection of completely free-standing desktop
applications, for example, while others (such as
Click To Meet Express) may integrate them as a single
application, providing a (usually) easier-to-use interface that readily
exploits synergies resulting from the combination of tools. Such products
are sometimes referred to as "rich presence" solutions, though there are
certainly degrees of "richness" and different combinations of tools.
Another interpretation of integrated communications tools was
demonstrated at a recent Internet2 Joint Techs conference, which included
"SIP-based voice, video, and
instant messaging over wireless fidelity (WiFi), and SIP voice conferencing
- all in the context of rich presence derived from WiFi location service and
enterprise calendaring."
Participants were able to place SIP voice calls to any
user at a SIP.edu-enabled institution (http://voip.internet2.edu/SIP.edu/)
and were able to eavesdrop on meeting sessions by calling special "room
buddies."
This demonstration involved contributions from the
Internet Real-Time Lab
(IRT) in the Department of Computer Science at Columbia University,
the Internet2 Presence and Integrated
Communications Working Group, and several other groups.
For more information on SIP see this.
The use of multiple devices in a dynamic environment like a conference
venue also illustrates the need for databases listing devices assigned to
each user, as well as their operating characteristics and configurations,
including authentication information and preferences.
The ViDe Consortium in cooperation with the Southern Universities
Regional Association (SURA) has recently spearheaded the development of
the H.350 LDAP schema which provides a standardized storage format for
such data. Used with SIP, H.350 should make it possible for device
manufacturers to design devices that will interoperate seamlessly in
demanding mobile environments. For more information on H.350 see
this.
These issues hint at the need for centralized organizational support for video
conferencing. Users can get good mileage out of simple video conferencing
tools such as NetMeeting and Microsoft Messenger, but campus administration
must consider
The ClickToMeet Express server also demonstrates a different kind of
integration. ClickToMeet Express is a web-based service that allows
(Wintel) users to simply start their desktop browsers and connect to
a ClickToMeet server which downloads the video conferencing client to
their browsers in the form of a web "plug-in". No software beyond
the browser is required to use this tool.
Although tool integration is more readily apparent on the desktop side of AV,
it occurs on the server side as well. For example, there
exist tools for conducting video conferences that record designated
portions of a conference (or the entire exchange), automatically archive the
recording, and then stream it to users on demand after the event.
This kind of capability could be used to provide review materials to
students who attend such a conference, or to "include" students who miss
the original.
Approaches to integration constitute much of the thrust of research and
innovation in the area of audio-video product development, deployment and use.
They also tend to inject some confusion into the choice of tools, as users
are tempted to apply tools specialized for one purpose to another, or
as purchasers attempt to decide among competing products with ill-defined
sets of goals or expectations.
One way this plays out in the video conferencing arena is that video
conferencing products are sometimes designed around collaboration groups
of particular sizes, distribution, and technical ability of the users.
For example, most desktop tools are designed for face-to-face meetings.
On the other hand, one complex system,
the Access Grid, developed at Argonne National Laboratory, was explicitly
designed to bring several (up to 10 or so) small groups together in an
environment minimally disruptive to the participant experience, but
amply supported by technical staff who could handle the details.
Video streaming is used most frequently to display events of interest
such as athletic events or lectures that have a relatively short duration.
However, streaming can also be used to drive "video signage," when
a continuous stream is used to feed one or more video displays located
in high-traffic areas where they are encountered by numerous passers-by.
This combination of basic functionalities provides an opportunity to
develop a variety of asynchronous services, particularly systems for
embedding video content within text documents, but also including
various kinds of navigational aids, associated descriptions of content
("metadata"), synchronized closed-captioning, scene analysis, etc.
Note that video streaming and archiving may require ancillary
services as part of the delivery infrastructure.
The distribution of commercial video content over IP connections, for
example, must usually include tools for discovery, authentication,
access control and accounting.
Also some streaming systems might employ computers to display content,
while others allow users to view content on standard televisions by
using a set-top-box to convert incoming signals to a form suitable for
televisions.
Evaluating the alternatives and choosing among them can be a daunting task.
Examples of VoD sites delivering educational content include Georgia
Public Television (900 hours of educational content recorded at 384Kbps)
and ResearchTV (with 2000 hours of MPEG-2 content).
There is at least one open source initiative in this area. The
SURF/net Video Portal
system supports a unified video database serving multiple
streaming servers which, in turn, serve user client requests.
As a more esoteric example, there exist tools for converting facial
images into a shorthand representation of facial gesture.
Streams of facial shorthand can then be shipped across
a network and used to drive an avatar image presented to collaborators.
Such an approach could allow video conferencing at much lower bandwidths
than normally required for an effective exchange.
Image analysis also finds use in data reduction, surveillance,
navigation and remote instrumentation applications.
There have been numerous attempts to develop database systems
to simplify retrieval of stored video content.
The KU Digital Jayhawk employs an innovative, automated indexing system that
receives audio/visual information in the form of the daily KUJH televised
newscast, breaks the AV stream into parts using scene analysis as described
above, stores each portion on a web site for later use, and indexes each part
using words appearing in text broadcast as a closed-captioned stream
or displayed in the teleprompters read by on-screen announcers.
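The indexing step described above amounts to building an inverted index from caption words to video segments. The sketch below shows that general technique only; it is not the Digital Jayhawk implementation, and the function names, stop-word list and segment identifiers are all assumptions made for illustration. Scene analysis is assumed to have already split the newscast into segments with their caption text.

```python
import re
from collections import defaultdict

# Small illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "for"}


def tokenize(text):
    """Lower-case the text and return its words, minus stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_index(segments):
    """Map each caption word to the set of segment IDs in which it appears.

    segments: iterable of (segment_id, caption_text) pairs.
    """
    index = defaultdict(set)
    for segment_id, caption in segments:
        for word in tokenize(caption):
            index[word].add(segment_id)
    return index


def search(index, query):
    """Return sorted IDs of segments whose captions contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

Each segment ID would correspond to a stored clip on the web site, so a search result can link directly to the matching portion of the newscast.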
Nationwide there are several projects underway that aim to provide cataloged
and/or indexed video materials for networked distribution: Georgia
Public Television, Wisconsin Public Television, KCPT's Chalkwaves, etc.
This approach appears to be replacing satellite as the preferable method
of video delivery. Georgia has T1 or fiber to every K-12 school in the state, over
which they hope to deliver targeted video material. Such customization
would be prohibitively expensive over satellite.
There are also several standards for cataloging video resources: MPEG-7,
MARC, Dublin Core, MODS, LOM, etc., though most video collections seem
to be using modified versions of "standard" approaches.
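As an illustration of what a catalog entry might look like, the following sketch builds a simple Dublin Core description of a video clip as XML. The element names (title, creator, date, type, format) and the namespace come from the Dublin Core element set; the enclosing `record` element and the function names are assumptions of this example, since Dublin Core itself does not prescribe a wrapper.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a <record> element with one dc:* child per (name, value) pair."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


def to_xml(record):
    """Serialize the record as an XML string."""
    return ET.tostring(record, encoding="unicode")
```

A clip could then be described with, for example, `dublin_core_record([("title", "Evening newscast"), ("type", "MovingImage"), ("format", "video/mpeg")])`, where the title is a made-up value. Collections that modify a "standard" approach typically do so by adding local elements alongside fields like these.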
At least one "meta-collection" or collection of collections exists; the
In addition, there are some instances where video and animation streams
might be merged to produce a hybrid result, as with combining a real-time
video "head shot" with an animated or goniometer-driven avatar body, or
projecting directions or descriptions onto video images in real time,
so-called "augmented reality".
For example, some groups are experimenting with "eyeglasses" that project
video streams captured by a camera mounted on the eyeglass frame pointed
at the scene that would normally be seen by the unaided eye.
Overlays show directions to a destination site, captions on objects
in the scene, and (woe is me) advertisements. Other applications include
virtual furniture or laboratory instruments, and internal component views
of bodies, machines, geologic formations, buildings, etc. For example,
X-ray and/or ultrasound images of internal organs could be laid over
real-time patient views.
Other work has projected costumes and/or additional appendages onto a
stick-figure model of a subject put together from video capture images.
This approach has already been used for dramatic effect in theatrical
performances demonstrated by Internet2 member institutions.
Using digital formats and media for such purposes continues to be debated.
Some formats, such as MPEG-2, have been well standardized and commoditized
to the point where they may be able to resist various ravages of time and
therefore be safely used for long-term archiving.
Not all commentators agree, however.
For a dissenting (or at least cautionary) opinion see
this article by VidiPax, which focuses on the need to make high-quality
digital versions of analog materials. Also, even if the digitized form
of a video record survives, it is unclear whether software required to read
and display the digitized format will be available (although it seems highly
likely at this point in time that such software would be reconstructable
if not continuously available).
The following list of resources required to provide services
listed above has been prepared to give readers a general sense of the
requirements:
Adjustments must be made to the networking infrastructure to deliver
high volume traffic protected from congestion, and video streams must
probably be accessible through commodity display systems (such as standard
television sets controlled by hand-held remotes).
Using commodity display equipment requires the acquisition of Set Top
Boxes (STBs), which can convert IP streams to TV-ready signals and
assist with access control.
STBs can add expense, but also add capabilities that may be of interest
to users. In particular, they may be used to pause programs during
viewing, record programs for later viewing, and access the web as a simple
web browser.
Some activities, such as video conferencing, will probably be used widely
enough and have been commoditized to the point that they are suitable
for federation.
The role of central IT groups in supporting such distributed services could be
limited to
consulting about network provisioning and equipment acquisition,
basic training in the use of conferencing equipment, and occasional
emergency assistance.
Other services, such as streaming, could probably better be provided
by servers operated by centralized IT groups in protected environments
with high speed network connectivity, high reliability, and secure
settings.
Keep in mind that there will always be some customers who would rather
hire outside providers or central IT groups to provide video conferencing
services straight away, and that
some departments will be (and already are) equipped to operate their own
stream servers.
Presumably there will always be a varied mix of customer needs.
Also, there will always be reasons to implement portions of services
within different groups. For example, streaming video might be captured by
camera operators and technicians in any campus department, but streamed
from servers managed and operated by central IT services.
And, of course, other partitions can be easily imagined.
Possible audiences include
Audio/video services can be of value to many members of the broad educational
community, as well as to participants outside that community who
wish to interact with individuals or groups within that community:
Activities to be facilitated and/or enhanced:
Technology, in general, and audio/video technology, in particular, can play,
and has played, a role in the following general activities related to
the educational mission:
These activities are simple rubrics for collections of component
activities that vary greatly from environment to environment and
practitioner to practitioner.
Often, however, they entail the same or similar more specific
component activities.
For example, "teaching" and "learning" usually include a
"collaboration" component along with various data and lecture material
presentation components, so that it may be more useful to categorize
community activities in the following manner:
Services to be provided
Audio/video technology can make three basic contributions to the participant
experience in these activities.
First, it can enable participation across distance. Second, it
can enhance participant experience by adding multiple supportive
communications media. Third, it can enable "asynchronous"
interaction.
Basic audio- and video-related capabilities:
Although there exists a plethora of products designed to facilitate
the activities listed earlier, these products all represent different
combinations of a few basic capabilities expressed in slightly different
ways and with differing degrees of quality.
Combinations of basic technologies
Some of these basic services may be requested and delivered as is.
For example, some groups (such as commercial sports or news networks)
affiliated with campus offices may wish to transport analog audio/video
signals over the campus fiber infrastructure without any other processing.
In addition, some legacy video equipment still relies on such transport.
Video conferencing
Certain kinds of integration probably require central support
because they require special hardware, centralized databases, and/or other
facilities likely to be used by a large proportion of the user community.
For example,
probably require some sort of centralized organizational support.
Video streaming
Video archiving and Video on Demand (VoD)
Image analysis
Cataloging and Indexing
Integrating video materials with other documents
Animation and hybrid video/animation
(Long-term) archival storage
Activities and equipment required to deliver A/V services:
Specialized equipment, software, and network connectivity are
required to provide these services. Specialized event production,
computing, logistics and operations skills are required to
provide a complete suite of video services, though there is considerable
overlap among services, which makes it important to match services and
resources so as to waste neither.
Video conferencing can require:
Video streaming can require:
Video capture can require:
Video transport can require:
Video archiving and Video on Demand can require:
Commercial video delivery
Commercial video will involve the same kinds of staffing and equipment
issues involved in video archiving and VoD, but overall resource requirements
are more demanding for several reasons. First, commercial video must be
delivered with relatively high quality. Second, video streams
must be carefully restricted to authorized users and usage accounting
is probably required. This requires access control arrangements not
necessarily required for serving non-commercial content.
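The two extra requirements named here, restricting streams to authorized users and accounting for usage, can be sketched together. The class below is purely illustrative: its name, fields and the idea of a per-user set of entitled titles are assumptions of this example, and a production system would rely on an external authentication service and persistent storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StreamGate:
    entitlements: dict                             # user -> set of title IDs
    usage_log: list = field(default_factory=list)  # one entry per request

    def request(self, user, title, now=None):
        """Decide whether a user may view a title, logging the attempt."""
        allowed = title in self.entitlements.get(user, set())
        self.usage_log.append({
            "user": user,
            "title": title,
            "granted": allowed,
            "time": (now or datetime.now(timezone.utc)).isoformat(),
        })
        return allowed

    def usage_by_user(self):
        """Count granted requests per user, e.g. for billing or reporting."""
        counts = {}
        for entry in self.usage_log:
            if entry["granted"]:
                counts[entry["user"]] = counts.get(entry["user"], 0) + 1
        return counts
```

Logging denied requests as well as granted ones is a design choice made here so that the same log can support both accounting and the review of access problems.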
Alternative organizational structures for A/V service delivery
The video and video-related services listed above may be provided via a
number of organizational approaches. For example, they may be
centralized under the control of a single organizational entity or they
may be federated into organizational departments and/or individual
groups within departments.
Outline of a strategic approach
Establishing a long-term strategy for audio/video services
will entail a number of steps. What follows is an outline of
activities required to generate a broad plan.