video retrieval, rag
Search for relevant videos.
9,967 technical terms and definitions
Remove camera shake.
Apply style consistently across frames.
Apply style to videos.
Video super-resolution increases spatial resolution by leveraging temporal information across frames.
Increase video resolution.
Swin transformer adapted to video.
Various transformer designs for video.
Learn from video-text pairs.
Video generation models create videos from text or images. This is an emerging field, with models such as Sora showing impressive results.
Encode viewing direction.
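One common reading of this entry is the sinusoidal encoding that neural radiance field models apply to the viewing direction before it reaches the network. The following is a minimal sketch under that assumption; the function name `encode_direction` and the `num_freqs` default are illustrative choices, not taken from any particular library.

```python
import math

def encode_direction(direction, num_freqs=4):
    """Map a 3D unit direction to sin/cos features at doubling frequencies.

    Assumption: frequencies are 2^k * pi for k = 0..num_freqs-1.
    """
    features = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        for component in direction:
            features.append(math.sin(freq * component))
            features.append(math.cos(freq * component))
    return features

# A direction pointing straight along +z.
encoded = encode_direction((0.0, 0.0, 1.0))
print(len(encoded))  # 3 components * 2 functions * 4 frequencies = 24
```

The encoded vector is what a network would typically consume in place of the raw three-component direction.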
View factors quantify how the geometric configuration of surfaces affects radiative heat exchange between them.
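As a worked illustration, the view factor between two parallel, coaxial disks has a standard closed-form expression. The sketch below evaluates it and checks the reciprocity relation A1 * F12 = A2 * F21; the function name and the sample radii are assumptions chosen for the example.

```python
import math

def disk_to_disk_view_factor(r1, r2, h):
    """View factor F12 from disk 1 (radius r1) to a parallel, coaxial
    disk 2 (radius r2) separated by distance h."""
    R1 = r1 / h
    R2 = r2 / h
    S = 1.0 + (1.0 + R2 ** 2) / R1 ** 2
    return 0.5 * (S - math.sqrt(S ** 2 - 4.0 * (r2 / r1) ** 2))

# Hypothetical geometry: radii 1 and 2, separation 1 (any consistent unit).
f12 = disk_to_disk_view_factor(1.0, 2.0, 1.0)
f21 = disk_to_disk_view_factor(2.0, 1.0, 1.0)

area1 = math.pi * 1.0 ** 2
area2 = math.pi * 2.0 ** 2

# Reciprocity: both products should agree up to rounding error.
print(area1 * f12, area2 * f21)
```

Because the view factor depends only on geometry, the same function applies regardless of surface temperature or emissivity.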
Create different perspectives of data.
Reference vs duplicate tensors.
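The distinction can be shown without a tensor library. This sketch uses the standard-library `array` and `memoryview` types as a stand-in for tensors (an assumption made to keep the example self-contained): a view references the original buffer, while a slice of an `array` is an independent duplicate.

```python
from array import array

data = array("f", [1.0, 2.0, 3.0, 4.0])

view = memoryview(data)[1:3]  # references the same buffer as data
duplicate = data[1:3]         # slicing an array copies the elements

view[0] = 20.0  # write through the view

print(data[1])       # 20.0: the original sees the change
print(duplicate[0])  # 2.0: the copy is unaffected
```

Views avoid the memory and time cost of copying, at the price that writes through one name are visible through the other.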
Violin plots combine box plots with density estimation to show full distributions.
Adversarial regularization for semi-supervised learning.
Isolated Python environments.
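A minimal sketch of creating such an environment programmatically with the standard-library `venv` module. The helper name `create_env` and the directory name are hypothetical, and pip installation is skipped here to keep the example fast and offline.

```python
import tempfile
import venv
from pathlib import Path

def create_env(env_dir):
    """Create an isolated environment at env_dir (without pip) and
    return the path of its configuration file."""
    venv.create(env_dir, with_pip=False)
    return Path(env_dir) / "pyvenv.cfg"

# Build a throwaway environment inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / "demo-env")
    print(cfg.exists())
```

The command-line equivalent is `python -m venv <directory>`; packages installed into the environment stay separate from the system interpreter.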
Simulate entire fab operations.
Simulate entire process flow before manufacturing.
Virtual metrology predicts measurements from process data, reducing the need for physical measurement.
Predict measurements from tool sensor data.
Predict metrology results from tool sensor data without physical measurement.
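The idea can be sketched with a simple least-squares fit: learn a mapping from a sensor reading to a measured quantity on wafers that were measured, then apply it to wafers that were not. The data below is synthetic and made up for illustration, and a single-feature linear model is an assumption; production systems generally use many sensor signals and richer models.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, made-up data: chamber temperature (C) vs. film thickness (nm).
temperatures = [400.0, 410.0, 420.0, 430.0]
thicknesses = [100.0, 102.0, 104.0, 106.0]

slope, intercept = fit_line(temperatures, thicknesses)

# Estimate thickness for a wafer that was processed but not measured.
predicted = slope * 415.0 + intercept
print(round(predicted, 3))
```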
Computationally screen compound libraries.
Optical positioning system.
Vision-language models (VLMs) understand both images and text. They are used for image captioning, visual question answering, and diagram analysis.
State space model for vision.
Apply SSM to visual tasks.
Different ViT architectures.
Scaling ViT to billions of parameters and massive datasets.
Navigate environments using language instructions.
Generate text describing images.
Deep integration of vision and language understanding.
Models trained to understand both images and text together.
Plan actions using vision and language.
Training tasks for VL models.
Tasks for learning multimodal representations.
Models that perceive and act.
Understand why things happen in images.
Apply commonsense to visual scenes.
Visual controls make status, abnormalities, and standards immediately apparent to anyone.
Determine if image entails text.
Localize objects from language.
Train models to follow visual instructions.
Visual management makes status, standards, and abnormalities immediately obvious.
Navigate using visual input.
Estimate camera motion from video.
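Once relative motion between consecutive frames has been estimated (for example from matched image features, which is not shown here), the camera trajectory follows by chaining those motions. This is a simplified planar sketch under that assumption; the function names and the sample motions are hypothetical.

```python
import math

def compose(pose, motion):
    """Apply a relative motion (dx, dy, dtheta), expressed in the camera's
    own frame, to a global planar pose (x, y, theta)."""
    x, y, theta = pose
    dx, dy, dtheta = motion
    return (
        x + math.cos(theta) * dx - math.sin(theta) * dy,
        y + math.sin(theta) * dx + math.cos(theta) * dy,
        theta + dtheta,
    )

def integrate(motions, start=(0.0, 0.0, 0.0)):
    """Chain per-frame relative motions into a list of global poses."""
    trajectory = [start]
    for motion in motions:
        trajectory.append(compose(trajectory[-1], motion))
    return trajectory

# Hypothetical estimates: move forward 1 while turning left 90 degrees,
# then move forward 1 again.
motions = [(1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)]
trajectory = integrate(motions)
print(trajectory[-1])
```

Because each pose is built on the previous one, small per-frame errors accumulate over time, which is why drift is a characteristic concern in this kind of estimation.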
Visual prompting includes images or diagrams to supplement textual instructions.
Use visual elements as prompts.
Answer questions about image content.