June 2024
Our FF25 dataset contains 8,703 portrait images of 25 public figures and the corresponding text descriptions. All the images were crawled from publicly available sources on the Web. These 25 subjects include politicians, movie stars, writers, athletes and businessmen, with diverse genders, races, and career domains. As shown in Figure 11, the dataset contains 400-1,300 images of each subject.
Each raw image is then center-cropped to a resolution of 512×512. For each image, we use a pre-trained BLIP2 image captioning model to generate the corresponding text description. To avoid hallucination, we prompt BLIP2 with the input “a photo of <person_name> which shows”.
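The two preprocessing steps above can be sketched as follows. This is a minimal illustration, not our released code: it assumes the crop is the largest centered square (to be resized to 512×512 afterwards), and the function names `center_crop_box` and `caption_prompt` are hypothetical.

```python
def center_crop_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square.

    Assumption: the square is resized to 512x512 afterwards; the exact
    crop-and-resize order is not specified in the description above.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def caption_prompt(person_name: str) -> str:
    """Build the text prefix given to the captioning model for one subject."""
    return f"a photo of {person_name} which shows"
```

The returned box uses the (left, top, right, bottom) convention that most image libraries accept for cropping.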
The file structure is as follows:
test/<person_name>/img_<img_number>.png
test/metadata.csv
train/<person_name>/img_<img_number>.png
train/metadata.csv
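Given this layout, the number of images per subject in one split can be counted with a few lines of Python. This is a sketch that relies only on the directory pattern listed above; the helper name `count_images_per_subject` is hypothetical, and the columns of `metadata.csv` are not assumed here.

```python
from collections import Counter
from pathlib import Path


def count_images_per_subject(split_dir: Path) -> Counter:
    """Count img_*.png files under each <person_name> folder of a split."""
    counts = Counter()
    for image_path in split_dir.glob("*/img_*.png"):
        # The parent folder name is the subject's name.
        counts[image_path.parent.name] += 1
    return counts
```

For example, `count_images_per_subject(Path("train"))` returns a mapping from each subject folder to its number of training images.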
For details, please refer to our related source code repository and paper.
January 2024
The Acoustic Waveform Respiratory Evaluation (AWARE) dataset consists of a group of human airway measurements, produced by our integrated AI and sensing systems for smart pulmonary telemedicine.
This dataset contains airway measurements of 382 human subjects, including patients with various pulmonary diseases and healthy control subjects, recruited from the Children’s Hospital of Pittsburgh during the past 3 years. The contents of the dataset include raw WAV files from acoustic sensing, segmented and aligned acoustic signal pulses, and processed measurements of airway cross-sectional areas.
To the best of our knowledge, this is the first public dataset of human airway measurements with pulmonary diseases, and we welcome any feedback from the smart health research community.
January 2024
This dataset is used for multimodal question-answering tasks in autonomous driving scenarios. We created it from the nuScenes-QA dataset for the evaluation in our paper, Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI. The dataset is stored on HuggingFace.
This dataset is built on the nuScenes mini-split, and the QA pairs are taken from the original nuScenes-QA dataset. Each data sample contains 6-view RGB camera captures, a 5D LiDAR point cloud, and a corresponding text QA pair. The data in the nuScenes-QA dataset was collected from driving scenes in Boston and Singapore, with diverse locations, times, and weather conditions.
The samples are divided into day and night scenes:
Scene | # train samples | # validation samples |
---|---|---|
day | 2,229 | 2,229 |
night | 659 | 659 |
In this dataset, the questions are generally difficult and may require multiple hops of reasoning over the RGB and LiDAR data. For example, to answer the sample question in the above figure, the ML model needs to first identify the direction in which the “construction vehicle” appears, and then count the number of “parked trucks” in that direction. In our evaluations, we further cast question answering (QA) as an open-ended text generation task. This is more challenging than the evaluation setup in the original nuScenes-QA paper, where the QA task is a classification task over a predefined answer set.
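To make the open-ended setting concrete, the sketch below scores free-form generated answers by normalized exact match against the reference answers. The scoring rule (lowercasing, stripping punctuation, exact string comparison) is an assumption made for illustration and is not necessarily the metric used in our paper; `normalize` and `exact_match_accuracy` are hypothetical names.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    lowered = text.lower().strip()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of generated answers that match the reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)
```

Unlike classification over a fixed answer set, a generated answer such as "a truck" does not match the reference "truck" under this rule, which is one reason the open-ended setup is harder.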
In most RGB images in the nuScenes dataset, night scenes are still well lit (e.g., by street lights), as shown on the left of the above figure. We therefore reduce the brightness of the RGB captures in night scenes by 80% and apply a Gaussian blur with a radius of 7, as shown on the right of the above figure. After this preprocessing, the night training and validation splits contain 659 samples each. The RGB views in daytime scenes are left unchanged, and the day training and validation splits contain 2,229 samples each.
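The night-scene preprocessing can be illustrated with a small pure-Python sketch on a grayscale pixel grid. It assumes that "reduce the brightness by 80%" means multiplying pixel values by 0.2, and that the blur radius is the standard deviation of the Gaussian; both are assumptions, and the names below are hypothetical. In practice an image library (for example Pillow's `ImageEnhance.Brightness` and `ImageFilter.GaussianBlur`) would be used on the full RGB images.

```python
import math


def gaussian_kernel(sigma: float) -> list[float]:
    """Normalized 1D Gaussian weights covering about three sigmas per side."""
    half = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values: list[float], kernel: list[float]) -> list[float]:
    """Convolve one row or column, clamping indices at the borders."""
    half = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), last)
            acc += w * values[j]
        out.append(acc)
    return out


def simulate_night(pixels: list[list[float]], brightness_factor: float = 0.2,
                   sigma: float = 7.0) -> list[list[float]]:
    """Darken a grayscale grid, then apply a separable Gaussian blur."""
    dark = [[v * brightness_factor for v in row] for row in pixels]
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in dark]                 # horizontal pass
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]     # vertical pass
    return [list(row) for row in zip(*cols)]                      # back to row-major
```

Applying the blur as two 1D passes is equivalent to a 2D Gaussian blur and keeps the sketch short.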
With an internet connection, you can load this dataset directly using the HuggingFace Datasets library:
from datasets import load_dataset
# load train split in day scene
day_train = load_dataset("KevinNotSmile/nuscenes-qa-mini", "day", split="train")
December 2023
This dataset comes from our work on mobile GPU-based eavesdropping, Eavesdropping user credentials via GPU side channels on smartphones, presented at the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2022). It contains 3,466 traces that map on-screen keyboard key presses to the corresponding changes in Snapdragon Adreno GPU performance counters, collected on the device at the same time. As shown above, the raw mobile GPU performance counter (“PC”) values change whenever the screen display changes, including keyboard pop-up events, key disappearance events, and all other miscellaneous graphics changes. All GPU PC changes are recorded in the raw traces.
The dataset is arranged in the following format:
- 1622457056 (example): The UNIX timestamp when the experiment took place.
- timestamp_data.csv: Raw recording of GPU performance counter changes during the experiment:
  - PERF_LRZ_VISIBLE_PRIM_AFTER_LRZ
  - PERF_LRZ_FULL_8X8_TILES
  - PERF_LRZ_PARTIAL_8X8_TILES
  - PERF_LRZ_VISIBLE_PIXEL_AFTER_LRZ
  - PERF_RAS_SUPERTILE_ACTIVE_CYCLES
  - PERF_RAS_SUPER_TILES
  - PERF_RAS_8X4_TILES
  - PERF_RAS_FULLY_COVERED_8X4_TILES
  - PERF_VPC_PC_PRIMITIVES
  - PERF_VPC_SP_COMPONENTS
  - PERF_VPC_LRZ_ASSIGN_PRIMITIVES
  - PERF_VPC_SP_LM_COMPONENTS
- timestamp_keys.csv: Keyboard key presses that occurred during the experiment.

For a detailed discussion of the meanings of the different GPU PCs, please refer to Section 4 of our paper.
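As a rough illustration of how the two CSV files of a trace might be used together, the sketch below pairs each key press with the first GPU counter record at or after it. The CSV layout is an assumption: the column names `timestamp` and `key`, the timestamp units, and the sort order are hypothetical and should be checked against the actual files. The pairing rule itself is only one plausible choice.

```python
import bisect
import csv
import io


def read_rows(csv_text: str) -> list[dict]:
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def align_keys_to_counters(counter_rows: list[dict], key_rows: list[dict],
                           max_gap: float, time_col: str = "timestamp",
                           key_col: str = "key") -> list[tuple[str, dict]]:
    """Pair each key press with the first counter record within max_gap after it.

    Assumes counter_rows are sorted by time_col in ascending order
    (hypothetical layout; not confirmed by the dataset description).
    """
    times = [float(row[time_col]) for row in counter_rows]
    pairs = []
    for key_row in key_rows:
        key_time = float(key_row[time_col])
        i = bisect.bisect_left(times, key_time)
        if i < len(times) and times[i] - key_time <= max_gap:
            pairs.append((key_row[key_col], counter_rows[i]))
    return pairs
```

Key presses with no counter record inside the `max_gap` window are dropped rather than matched to an unrelated graphics change.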