TORNADO-Net: mulTiview tOtal vaRiatioN semAntic segmentation with Diamond inceptiOn module
M. Gerdzhev, R. Razani, E. Taghavi, and B. Liu
Abstract: Semantic segmentation of point clouds is a key component of scene understanding for robotics and autonomous driving. In this paper, we introduce TORNADO-Net – a neural network for 3D LiDAR point cloud semantic segmentation. We incorporate multi-view (bird's-eye and range) projection feature extraction with an encoder-decoder ResNet architecture and a novel diamond context block. Current projection-based methods do not take into account that neighboring points usually belong to the same class. To better utilize this local neighbourhood information and reduce noisy predictions, we introduce a combination of Total Variation, Lovász-Softmax, and Weighted Cross-Entropy losses. We also take advantage of the fact that LiDAR data encompasses a 360-degree field of view by using circular padding. We demonstrate state-of-the-art results on the SemanticKITTI dataset and also provide thorough quantitative evaluations and ablation results.
Citation: M. Gerdzhev, R. Razani, E. Taghavi, and B. Liu, “Tornado-net: multiview total variation semantic segmentation with diamond inception module,” in 2021 IEEE International Conference on Robotics and Automation, 2021.
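The circular padding mentioned in the abstract can be sketched as follows. This is a minimal illustration of the idea, not the paper's implementation; the function name and toy input are ours. The pad wraps the horizontal (azimuth) axis of a projected range image, so that convolutions see the 360-degree wrap-around between the first and last columns:

```python
import numpy as np

def circular_pad_range_image(img, pad):
    """Pad a (H, W) range image circularly along the horizontal
    (azimuth) axis by copying `pad` columns from each edge."""
    left = img[:, -pad:]   # columns taken from the right edge
    right = img[:, :pad]   # columns taken from the left edge
    return np.concatenate([left, img, right], axis=1)

# Toy 2x4 "range image": columns correspond to azimuth bins.
img = np.array([[0, 1, 2, 3],
                [4, 5, 6, 7]])
padded = circular_pad_range_image(img, 1)
# first row of `padded` is [3, 0, 1, 2, 3, 0]
```

In a deep-learning framework the same effect is typically obtained with a built-in circular padding mode rather than manual concatenation.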
LiDAR Few-Shot Domain Adaptation via Integrated CycleGAN and 3D Object Detector with Joint Learning Delay
E. R. Corral-Soto, A. Nabatchian, M. Gerdzhev, and L. Bingbing
Abstract: The success of supervised LiDAR perception methods relies on the availability of large sets of labeled point cloud data, for which the labeling process is costly and time-consuming. Given unpaired LiDAR datasets of similar sizes from two domains, with one (source) containing task-specific labels, e.g., 3D bounding boxes, for all frames, but only a small percentage of frames being labeled in the other (target) domain, it is challenging to train a model that generalizes well on validation data from the target domain. In this paper we propose a novel LiDAR few-shot domain adaptation architecture and training strategy to address this challenge. Our method is based on adapting a task-specific network (a 3D object detector) to work within the CycleGAN framework, modified to operate on LiDAR features, and on the joint end-to-end training of generators, discriminators, and task-specific layers. To overcome non-convergence issues we propose a training strategy that delays the joint learning between the generators/discriminators and the task-specific network: they start learning independently, and joint learning is introduced slowly as they converge, hence avoiding instability during the early stages of training. Our proposed integrated architecture enables a direct way to evaluate the performance of the model instead of feeding pre-computed generated data into a separate pre-trained model. We include an experimental section where we evaluate our proposed architecture on the publicly available KITTI and nuScenes datasets, as well as on our own labeled dataset. We present useful mean average precision plots that illustrate the benefits of our domain adaptation architecture as a function of the number of labeled target domain frames.
Citation: E. R. Corral-Soto, A. Nabatchian, M. Gerdzhev, and L. Bingbing, “Lidar few-shot domain adaptation via integrated cyclegan and 3d object detector with joint learning delay,” in 2021 IEEE International Conference on Robotics and Automation, 2021.
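The joint learning delay described in the abstract can be sketched as a simple loss-weight schedule. This is our own hypothetical illustration, not the paper's exact mechanism: the joint-loss weight stays at zero while the generators/discriminators and the task network train independently, then ramps up linearly once the delay has passed:

```python
def joint_learning_weight(step, delay_steps, ramp_steps):
    """Hypothetical schedule for the joint-loss weight:
    0 during the delay phase (components learn independently),
    then a linear ramp from 0 to 1 over `ramp_steps` steps."""
    if step < delay_steps:
        return 0.0
    return min(1.0, (step - delay_steps) / ramp_steps)

# e.g. total_loss = task_loss + joint_learning_weight(step, 100, 50) * gan_loss
```

Keeping the joint term switched off early avoids feeding unstable generator outputs into the detector while both are still far from convergence.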
On the Use of Modular Software and Hardware for Designing Wheelchair Robots
M. Gerdzhev, J. Pineau, I. M. Mitchell, P. Viswanathan, and G. Foley
Abstract: This short paper describes experiences in the development of several smart power wheelchair platforms across three different sites. In the course of the project, we have re-used several of the components (both hardware and software) despite differences in the base platform of the robots. We describe the different platforms, and discuss some of the challenges and results of our work.
Citation: M. Gerdzhev, J. Pineau, I. M. Mitchell, P. Viswanathan, and G. Foley, “On the use of modular software and hardware for designing wheelchair robots”, in 2016 AAAI Spring Symposium Series, 2016.
Usability and use of SLS: caption
D. Fels, M. Gerdzhev, J. Ho, E. Hibbard
Abstract: SLS:Caption provides captioning functionality for deaf and hearing users to provide captions to video content (including sign language content). Users are able to enter and modify text as well as adjust its font, colour, location and background opacity. An initial user study with hearing users showed that SLS:Caption was easy to learn and use. However, users seemed reluctant to produce captions for their own video material; this was likely due to the task complexity and time required to create captions regardless of the usability of the captioning tool.
Citation: D. Fels, M. Gerdzhev, J. Ho, E. Hibbard, “Usability and use of SLS: caption”, in Proceedings of the 12th international ACM SIGACCESS conference on Computers and accessibility (ASSETS ’10). ACM, New York, NY, USA, 291-292, 2010.
DEX – A Design for Canine-Delivered Marsupial Robot
M. Gerdzhev, J. Tran, A. Ferworn, D. Ostrom
Abstract: This paper presents the work on Drop and EXplore (DEX), a small rescue robot to be used in Urban Search and Rescue (USAR) operations. Unlike other rescue robots, DEX was designed to be used in tandem with trained USAR canines. The development of DEX was part of a new concept called Canine Assisted Robot Deployment (CARD). CARD utilizes search canines to deliver robots close to the casualties trapped under rubble. A small robot is attached to a search dog. After the dog uses its agility and sense of smell to find a casualty, the robot is deployed when the dog gives its bark indication. This method circumvents the main problem of current response robots: their inability to traverse rubble. As DEX was constructed to test the concept of CARD, its design is described in this paper along with the experiments conducted.
Citation: M. Gerdzhev, J. Tran, A. Ferworn, D. Ostrom, “DEX – A Design for Canine-Delivered Marsupial Robot”, in 8th IEEE International Workshop on Safety, Security, and Rescue Robotics (SSRR-2010), Bremen, Germany, 2010.
Canine Assisted Robot Deployment for Urban Search and Rescue
J. Tran, A. Ferworn, M. Gerdzhev, D. Ostrom
Abstract: In Urban Search and Rescue (USAR) operations the search for survivors must occur before rescue operations can proceed. Two methods that can be used to search in rubble are trained search dogs and specialized response robots (sometimes called rescue robots). Rescue robots are used to collect information about people trapped within a disaster site such as a collapsed building. Information from them can help first responders plan and execute a rescue effort. The main challenge for these robots is the restrictions placed on their mobility by challenging rubble surfaces. While current research in this area addresses this challenge through mechanical design, good solutions remain elusive. This paper presents a new method for deploying response robots called Canine Assisted Robot Deployment (CARD). CARD’s approach utilizes USAR dogs to deliver robots close to a trapped human detected by the dog. This method exploits the canine ability to find survivors using their olfactory sensors and agility. Once a dog carrying a small robot has found a casualty, the robot can be dropped and begin exploring. Initial experiments and results are described in this paper.
Citation: J. Tran, A. Ferworn, M. Gerdzhev, D. Ostrom, “Canine Assisted Robot Deployment for Urban Search and Rescue”, in 8th IEEE International Workshop on Safety, Security, and Rescue Robotics (SSRR-2010), Bremen, Germany, 2010.
A Scrubbing Technique for the Automatic Detection of Victims in Urban Search and Rescue Video
M. Gerdzhev, J. Tran, A. Ferworn, K. Barnum and M. Dolderman
Abstract: In the discipline of Urban Search and Rescue (US&R), the faster a live human can be found, the more likely their rescue will be successful, with success being measured in lives saved. We have been working to augment trained US&R dogs with technology to help first responders in the US&R effort and give them a better understanding of the condition of the disaster area being searched and the trapped people who are found. As one can imagine, the video feed from a dog can be quite jittery. We have been exploring ways to speed up the process of video “scrubbing” by automatically discarding segments of video which show nothing interesting and concentrating on segments that are critical. This paper discusses one of these techniques.
Citation: M. Gerdzhev, J. Tran, A. Ferworn, K. Barnum and M. Dolderman, “A Scrubbing Technique for the Automatic Detection of Victims in Urban Search and Rescue Video”, in 6th International Wireless Communications and Mobile Computing Conference (IWCMC 2010), Caen, France, 2010.
Continuing Progress in Augmenting Urban Search and Rescue Dogs
J. Tran, M. Gerdzhev, and A. Ferworn
Abstract: Canine Augmentation Technology (CAT) is a telepresence system worn by search canines to be used in Urban Search and Rescue (US&R) operations. The intended purpose of CAT is as a tool for search teams and emergency managers to assess the situation when the dog finds a survivor in a collapsed structure. Data about the environment is transmitted to searchers and managers from the dog, which may be able to penetrate farther into a rubble pile than humans. Certain critical information can help the rescue team by allowing them to understand the situation around the victim before they actually attempt the rescue. This paper describes the latest developments in the CAT prototypes, discusses the improvements over previous versions, and makes comparisons to other telepresence systems used in US&R operations.
Citation: J. Tran, M. Gerdzhev, and A. Ferworn, “Continuing Progress in Augmenting Urban Search and Rescue Dogs”, in 6th International Wireless Communications and Mobile Computing Conference (IWCMC 2010), Caen, France, 2010.
Sign language online with signlink studio 2.0
D. I. Fels, M. Gerdzhev, E. Hibbard, A. Goodrum, J. Richards, J. Hardman and N. Thompson
Abstract: Recent advances in web technologies and services have expanded the possibility of having a truly multimedia, interactive web experience for many users. It also means that video-based content is not just possible but affordable. Online video-based sign language content is now much more prolific. However, most of the interactive functions remain text-based. We describe a tool, Signlink Studio, that allows hyperlinking functionality within video content, called signlinking, sign language-based form elements, and a sign language forum and content management system. We also present three methodologies specifically oriented towards creating sign language video materials for Signlink Studio or other video-based web applications. As a result of this focus on web functionality that is culturally and linguistically relevant, we suggest that a more equitable and inclusive web will evolve.
Peer-to-peer simulation for improving the system of serving multimedia on the web
M. Bashardoust, M. Gerdzhev, and A. Abhari
Abstract: P2P networks are scalable in serving multimedia files over the Internet, in contrast to the centralized client-server system in which the server provides all the resources to the clients. In this research, we have designed and implemented a Peer-to-Peer network simulator to efficiently serve multimedia on the Internet, given that P2P streaming applications scale very well even in large-crowd scenarios. Simulated data based on currently popular multimedia-serving websites such as YouTube was used to conduct experiments. We then simulated user traffic using the P2P system to compare its performance with the traditional client-server system.
Citation: M. Bashardoust, M. Gerdzhev, and A. Abhari, “Peer-to-peer simulation for improving the system of serving multimedia on the web”, in Proceedings of the 2009 Spring Simulation Multiconference (SpringSim ’09), Society for Computer Simulation International, San Diego, CA, USA, 1-4, 2009.
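The scaling argument behind this abstract can be illustrated with a toy back-of-the-envelope model (our own illustration, not the paper's simulator; all names and numbers are ours): a centralized server must upload the full stream to every client, whereas in a P2P overlay each peer contributes part of the upload, so the server only covers the remainder:

```python
def server_upload_load(num_clients, stream_rate, peer_upload_fraction):
    """Toy model of server upload load (in stream-rate units):
    client-server = every client served directly by the server;
    P2P = peers re-upload `peer_upload_fraction` of the stream,
    and the server supplies only the rest."""
    client_server = num_clients * stream_rate
    p2p = num_clients * stream_rate * (1.0 - peer_upload_fraction)
    return client_server, p2p

# 1000 viewers, unit stream rate, peers re-uploading half the stream:
cs, p2p = server_upload_load(1000, 1.0, 0.5)
# server load drops from 1000.0 to 500.0 units
```

As the peer upload fraction approaches 1, server load becomes nearly independent of audience size, which is the scalability property the abstract refers to.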