The DiPART (Diagram Part Labeling) dataset is a collection of diagrams whose parts are annotated with point supervision. It is designed to evaluate algorithms on the task of One Shot Part Labeling in the diagram-to-diagram scenario.

This dataset explorer visualizes the DiPART dataset. For visualization purposes, all images are resized to 400x400 and shown with their point annotations overlaid. The dataset itself contains the raw images at their original, arbitrary sizes, with no overlays.
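
Because the explorer shows resized images while the dataset stores images at their original sizes, point annotations have to be rescaled to line up with a 400x400 view. The following is a minimal sketch of that mapping, not the explorer's actual code. It assumes each annotation is an (x, y) pixel coordinate in the original image; the annotation file format is not described on this page, and the function name `rescale_points` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def rescale_points(
    points: List[Point],
    orig_size: Tuple[int, int],
    target_size: Tuple[int, int] = (400, 400),
) -> List[Point]:
    """Map (x, y) points from an image of orig_size to target_size.

    Sizes are (width, height). Each axis is scaled independently, so the
    aspect ratio is not preserved when the original image is not square
    (assumption: the explorer resizes without padding or cropping).
    """
    orig_w, orig_h = orig_size
    target_w, target_h = target_size
    if orig_w <= 0 or orig_h <= 0:
        raise ValueError("original image size must be positive")
    return [(x * target_w / orig_w, y * target_h / orig_h) for x, y in points]


# Example: two hypothetical part points on an 800x600 raw image.
scaled = rescale_points([(400.0, 300.0), (80.0, 60.0)], orig_size=(800, 600))
```

Here the centre of the 800x600 image maps to the centre of the 400x400 view. If the explorer pads or crops instead of stretching, the mapping would need an extra offset.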

Simple Statistics

Total number of categories: 200
Total number of images: 4,921
Total number of parts: 49,210 (10 per image)
Number of pairs: 50,835 pairs in train split
                 10,631 pairs in val (validation) split
                 10,055 pairs in test split

Download Dataset

Download Link (850.8 MB)


License for Non-Commercial Use
If this software is redistributed, this license must be included. The term software includes any source files, documentation, executables, models, and data.
This software and data is available for general use by academic or non-profit, or government-sponsored researchers.  It may also be used for evaluation purposes elsewhere.  This license does not grant the right to use this software or any derivation of it in a for-profit enterprise.  For commercial use, please contact The Allen Institute for Artificial Intelligence.  
This license does not grant the right to modify and publicly release the data in any form.
This license does not grant the right to distribute the data to a third party in any form.
The subjects in this data should be treated with respect and dignity. This license only grants the right to publish short segments or still images in an academic publication where necessary to present examples, experimental results, or observations.    
This software comes with no warranty or guarantee of any kind.  By using this software, the user accepts full liability.

The Allen Institute for Artificial Intelligence (C) 2017.


1.0 (November 2017)


Jonghyun Choi (