Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions Gadelha, Matheus, RoyChowdhury, Aruni, Sharma, Gopal, Kalogerakis, Evangelos, Cao, Liangliang, Learned-Miller, Erik, Wang, Rui, and Maji, Subhransu 2020
ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds Sharma, Gopal, Liu, Difan, Kalogerakis, Evangelos, Maji, Subhransu, Chaudhuri, Siddhartha, and Měch, Radomír ArXiv 2020
Search-Guided, Lightly-supervised Training of Structured Prediction Energy Networks Rooshenas, Amirmohammad, Zhang, Dongxu, Sharma, Gopal, and McCallum, Andrew NeurIPS 2019
Neural Shape Parsers for Constructive Solid Geometry Sharma, Gopal, Goyal, Rishabh, Liu, Difan, Kalogerakis, Evangelos, and Maji, Subhransu ArXiv 2019
Constructive Solid Geometry (CSG) is a geometric modeling technique that defines complex shapes by recursively applying boolean operations on primitives such as spheres and cylinders. We present CSGNet, a deep network architecture that takes as input a 2D or 3D shape and outputs a CSG program that models it. Parsing shapes into CSG programs is desirable as it yields a compact and interpretable generative model. However, the task is challenging since the space of primitives and their combinations can be prohibitively large. CSGNet uses a convolutional encoder and recurrent decoder based on deep networks to map shapes to modeling instructions in a feed-forward manner and is significantly faster than bottom-up approaches. We investigate two architectures for this task: a vanilla encoder (CNN)-decoder (RNN) model and another architecture that augments the encoder with an explicit memory module based on the program execution stack. The stack augmentation improves the reconstruction quality of the generated shape and learning efficiency. Our approach is also more effective as a shape primitive detector compared to a state-of-the-art object detector. Finally, we demonstrate CSGNet can be trained on novel datasets without program annotations through policy gradient techniques.
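To make the notion of a CSG program and its execution stack concrete, here is a minimal sketch of a toy 2D interpreter. It is not the CSGNet implementation: the pixel-set representation, the postfix program format, and the names `GRID`, `circle`, `square`, `OPS`, and `execute` are all assumptions chosen for illustration.

```python
# Hypothetical toy CSG interpreter for illustration; not the CSGNet code.
GRID = 16  # canvas side length in pixels (an arbitrary choice)


def circle(cx, cy, r):
    """Return the set of pixels inside a circle of radius r centred at (cx, cy)."""
    return {(x, y) for x in range(GRID) for y in range(GRID)
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r}


def square(cx, cy, half):
    """Return the set of pixels inside an axis-aligned square of half-width `half`."""
    return {(x, y) for x in range(GRID) for y in range(GRID)
            if abs(x - cx) <= half and abs(y - cy) <= half}


# Boolean operations on shapes, expressed as set operations on pixel sets.
OPS = {
    "union": lambda a, b: a | b,
    "intersect": lambda a, b: a & b,
    "subtract": lambda a, b: a - b,
}


def execute(program):
    """Run a postfix CSG program: primitives are pushed, operators pop two shapes."""
    stack = []
    for token in program:
        if isinstance(token, str):
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(token)
    if len(stack) != 1:
        raise ValueError("malformed program: stack should hold exactly one shape")
    return stack[0]


# A square with a circular hole: push both primitives, then subtract.
big = square(8, 8, 4)
hole = circle(8, 8, 2)
shape = execute([big, hole, "subtract"])
```

In this sketch the stack holds the intermediate shapes produced so far, which is the kind of execution state the abstract describes feeding back to the encoder; how CSGNet actually encodes that state is not specified here.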
Learning Point Embeddings from Shape Repositories for Few-Shot Segmentation Sharma, Gopal, Kalogerakis, Evangelos, and Maji, Subhransu 3DV 2019
User-generated 3D shapes in online repositories contain rich information about surfaces, primitives, and their geometric relations, often arranged in a hierarchy. We present a framework for learning representations of 3D shapes that reflect the information present in this metadata and show that it leads to improved generalization for semantic segmentation tasks. Our approach is a point embedding network that generates a vectorial representation of the 3D points such that it reflects the grouping hierarchy and tag data. The main challenge is that the data is noisy and highly variable. To this end, we present a tree-aware metric-learning approach and demonstrate that such learned embeddings offer excellent transfer to semantic segmentation tasks, especially when training data is limited. Our approach reduces the relative error by 10.2% with 8 training examples and by 11.72% with 120 training examples on the ShapeNet semantic segmentation benchmark, in comparison to the network trained from scratch. By utilizing tag data, the relative error is reduced by 12.8% with 8 training examples, in comparison to the network trained from scratch. These improvements come at no additional labeling cost as the metadata is freely available.
CSGNet: Neural Shape Parser for Constructive Solid Geometry Sharma, Gopal, Goyal, Rishabh, Liu, Difan, Kalogerakis, Evangelos, and Maji, Subhransu CVPR 2018
We present a neural architecture that takes as input a 2D or 3D shape and outputs a program that generates the shape. The instructions in our program are based on constructive solid geometry principles, i.e., a set of boolean operations on shape primitives defined recursively. Bottom-up techniques for this shape parsing task rely on primitive detection and are inherently slow since the search space over possible primitive combinations is large. In contrast, our model uses a recurrent neural network that parses the input shape in a top-down manner, which is significantly faster and yields a compact and easy-to-interpret sequence of modeling instructions. Our model is also more effective as a shape detector compared to existing state-of-the-art detection techniques. We finally demonstrate that our network can be trained on novel datasets without ground-truth program annotations through policy gradient techniques.
Persistent Aerial Tracking system for UAVs Mueller, Matthias, Sharma, Gopal, Smith, Neil, and Ghanem, Bernard In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2016
The ability to capture stabilized high-resolution video from low-cost UAVs has the potential to significantly redefine future objectives in the development of state-of-the-art object tracking methods. In this paper, we propose a persistent, robust and autonomous object tracking system designed for UAV applications, called Persistent Aerial Tracking (PAT) (see Fig. 1). Persistent aerial tracking can serve many purposes, not only related to surveillance but also search and rescue, wildlife monitoring, crowd monitoring/management, and extreme sports. Deploying PAT on UAVs is a very promising application, since the camera can follow the target based on its visual feedback and actively change its orientation and position to optimize for tracking performance (e.g. persistent tracking accuracy in the presence of occlusion or fast motion across large and diverse areas). This is the defining difference with static tracking systems, which passively analyze a dynamic scene to produce analytics for other systems. It enables ad-hoc and low-cost surveillance that can be quickly deployed, especially in locales where surveillance infrastructure is not already established or feasible (e.g. remote locations, rugged terrain, and large water bodies).