
User:Bazuz/sandbox/Simultaneous localization and mapping

From Wikipedia, the free encyclopedia
2005 DARPA Grand Challenge winner STANLEY performed SLAM as part of its autonomous driving system
A map generated by a SLAM Robot.

In navigation, robotic mapping and odometry for virtual reality or augmented reality, simultaneous localization and mapping (SLAM) is the computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent's location within it.[1][2][3][4] While this initially appears to be a chicken-and-egg problem, several algorithms are known for solving it, at least approximately, in tractable time for certain environments. Popular approximate solution methods include the particle filter, extended Kalman filter, covariance intersection, and GraphSLAM.

SLAM algorithms are tailored to the available resources, and hence are aimed not at perfection but at operational compliance. Published approaches are employed in self-driving cars, unmanned aerial vehicles, autonomous underwater vehicles, planetary rovers, newer domestic robots and even inside the human body.[5]


History


A seminal work in SLAM is the research of R.C. Smith and P. Cheeseman on the representation and estimation of spatial uncertainty in 1986.[6][7] Other pioneering work in this field was conducted by the research group of Hugh F. Durrant-Whyte in the early 1990s,[8] which showed that solutions to SLAM exist in the infinite data limit. This finding motivates the search for algorithms that are computationally tractable and approximate the solution.

The self-driving cars STANLEY and JUNIOR, developed by teams led by Sebastian Thrun, won the DARPA Grand Challenge and came second in the DARPA Urban Challenge, respectively, in the 2000s. Both included SLAM systems, which brought SLAM to worldwide attention. Mass-market SLAM implementations can now be found in consumer robot vacuum cleaners.[9]

Biological analogue


In neuroscience, the hippocampus appears to be involved in SLAM-like computations,[10][11][12] giving rise to place cells, and has formed the basis for bio-inspired SLAM systems such as RatSLAM.

Sensors

Accumulated registered point cloud from lidar SLAM.

SLAM systems generally use several different types of sensors, and the powers and limits of various sensor types have been a major driver of new algorithms.[13] Statistical independence between sensors is a mandatory requirement for coping with metric bias and with noise in measurements. Different types of sensors give rise to different SLAM algorithms whose assumptions are most appropriate to the sensors. At one extreme, laser scans or visual features provide details of many points within an area, sometimes rendering SLAM inference unnecessary because shapes in these point clouds can be easily and unambiguously aligned at each step via image registration. At the opposite extreme, tactile sensors are extremely sparse as they contain only information about points very close to the agent, so they require strong prior models to compensate in purely tactile SLAM. Most practical SLAM tasks fall somewhere between these visual and tactile extremes.
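The alignment step mentioned above can be illustrated with a minimal sketch of rigid 2D registration. It assumes that point correspondences between the two scans are already known, which real scan matchers (for example, iterative closest point) have to estimate themselves. The function name `align_2d` and the sample points are hypothetical and are not taken from any cited system.

```python
import math

def align_2d(source, target):
    """Least-squares rigid transform (theta, dx, dy) mapping source points
    onto target points, assuming source[i] corresponds to target[i]."""
    n = len(source)
    # Centroids of both point sets.
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n
    # Accumulate dot and cross products of the centred point pairs.
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(source, target):
        ax, ay, bx, by = ax - sx, ay - sy, bx - tx, by - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the target centroid.
    return theta, tx - (c * sx - s * sy), ty - (s * sx + c * sy)

# Hypothetical scans: scan_b is scan_a rotated by 0.3 rad and shifted by (2, -1).
scan_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
true_theta, true_dx, true_dy = 0.3, 2.0, -1.0
scan_b = [(math.cos(true_theta) * x - math.sin(true_theta) * y + true_dx,
           math.sin(true_theta) * x + math.cos(true_theta) * y + true_dy)
          for x, y in scan_a]
theta, dx, dy = align_2d(scan_a, scan_b)
```

With dense, noise-free scans such as these, the recovered transform gives the agent's motion between scans directly, which is why little additional inference is needed at this extreme.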

Sensor models divide broadly into landmark-based and raw-data approaches. Landmarks are uniquely identifiable objects in the world whose location can be estimated by a sensor, such as Wi-Fi access points or radio beacons. Raw-data approaches make no assumption that landmarks can be identified, and instead model the observations directly as a function of the location.
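As a rough illustration of a landmark-based sensor model, the sketch below scores candidate agent positions against a single range measurement to one identifiable beacon. The Gaussian noise model, the value of `sigma`, the function name `range_likelihood` and all coordinates are assumptions made for this example only.

```python
import math

def range_likelihood(position, landmark, measured_range, sigma=0.5):
    """Likelihood of a range reading to a known landmark, assuming
    zero-mean Gaussian range noise with standard deviation sigma."""
    expected = math.hypot(position[0] - landmark[0], position[1] - landmark[1])
    error = measured_range - expected
    return math.exp(-0.5 * (error / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

beacon = (0.0, 0.0)   # hypothetical radio beacon at a known location
measured = 5.0        # hypothetical range reading in metres
candidates = [(3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]
# The candidate whose predicted range best matches the reading scores highest.
best = max(candidates, key=lambda p: range_likelihood(p, beacon, measured))
```

A raw-data approach would replace the explicit landmark geometry with a model that maps a location straight to the expected sensor reading.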

Optical sensors


Optical sensors may be one-dimensional (single-beam) or 2D (sweeping) laser rangefinders, 3D high-definition LiDAR, 3D flash LiDAR, 2D or 3D sonar sensors, and one or more 2D cameras.[13] Since 2005, there has been intense research into VSLAM (visual SLAM) using primarily visual (camera) sensors, because of the increasing ubiquity of cameras such as those in mobile devices.[14] Visual and LiDAR sensors are informative enough to allow for landmark extraction in many cases. Other recent forms of SLAM include tactile SLAM[15] (sensing by local touch only), radar SLAM,[16] acoustic SLAM,[17] and wifi-SLAM (sensing by strengths of nearby Wi-Fi access points).[18] Recent approaches apply quasi-optical wireless ranging for multi-lateration (RTLS) or multi-angulation in conjunction with SLAM to compensate for erratic wireless measurements. A kind of SLAM for human pedestrians uses a shoe-mounted inertial measurement unit as the main sensor and relies on the fact that pedestrians are able to avoid walls to automatically build floor plans of buildings by an indoor positioning system.[19]

For some outdoor applications, the need for SLAM has been almost entirely removed by high-precision differential GPS sensors. From a SLAM perspective, these may be viewed as location sensors whose likelihoods are so sharp that they completely dominate the inference. However, GPS sensors may occasionally fail entirely or degrade in performance, especially during times of military conflict, which are of particular interest to some robotics applications.
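The effect of a very sharp likelihood can be seen in a one-dimensional sketch that multiplies two Gaussian beliefs (inverse-variance weighting). The numbers below are hypothetical and chosen only to contrast a broad SLAM estimate with a precise GPS fix; `fuse_gaussian` is an illustrative name, not part of any library.

```python
def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Product of two 1D Gaussian beliefs: the fused mean is the
    inverse-variance weighted average of the two input means."""
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)
    mean = var * (mean_a / var_a + mean_b / var_b)
    return mean, var

slam_mean, slam_var = 10.0, 4.0     # assumed SLAM estimate (std dev 2 m)
gps_mean, gps_var = 12.0, 0.0004    # assumed differential GPS fix (std dev 2 cm)
mean, var = fuse_gaussian(slam_mean, slam_var, gps_mean, gps_var)
```

Under these assumptions the fused mean lies within a millimetre of the GPS fix, so the SLAM estimate contributes almost nothing. If the GPS variance grows, for instance during an outage, the weighting shifts back towards the SLAM estimate.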

Acoustic SLAM


An extension of the common SLAM problem has been applied to the acoustic domain, where environments are represented by the three-dimensional (3D) positions of sound sources, an approach termed Acoustic SLAM.[20] Early implementations of this technique have used direction-of-arrival (DoA) estimates of the sound source location, and rely on principal techniques of sound localization to determine source locations. An observer or robot must be equipped with a microphone array to enable use of Acoustic SLAM, so that DoA features are properly estimated. Acoustic SLAM has laid foundations for further studies in acoustic scene mapping, and can play an important role in human-robot interaction through speech. To map multiple, and occasionally intermittent, sound sources, an Acoustic SLAM system uses foundations in random finite set theory to handle the varying presence of acoustic landmarks.[21] However, the nature of acoustically derived features leaves Acoustic SLAM susceptible to problems of reverberation, inactivity, and noise within an environment.
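Because a DoA estimate gives only a direction, a source position has to be inferred from several observer poses. The sketch below shows this in a deliberately simplified 2D, noise-free setting by intersecting two bearing rays. It is not the method of the cited papers, and the function name `triangulate` and all coordinates are hypothetical.

```python
import math

def triangulate(p1, bearing1, p2, bearing2):
    """Intersect two bearing rays (angles in radians, world frame)
    observed from positions p1 and p2."""
    d1 = (math.cos(bearing1), math.sin(bearing1))
    d2 = (math.cos(bearing2), math.sin(bearing2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        raise ValueError("bearings are parallel; the source position is unobservable")
    # Solve p1 + s * d1 = p2 + t * d2 for s using 2D cross products.
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    s = (rx * d2[1] - ry * d2[0]) / denom
    return p1[0] + s * d1[0], p1[1] + s * d1[1]

# Hypothetical setup: a sound source at (2, 3) heard from two observer poses.
source = (2.0, 3.0)
pose_a, pose_b = (0.0, 0.0), (4.0, 0.0)
doa_a = math.atan2(source[1] - pose_a[1], source[0] - pose_a[0])
doa_b = math.atan2(source[1] - pose_b[1], source[0] - pose_b[0])
estimate = triangulate(pose_a, doa_a, pose_b, doa_b)
```

In practice the bearings are noisy and the observer poses are themselves uncertain, which is why a probabilistic SLAM formulation is used rather than a direct intersection.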

Audio-Visual SLAM


Originally designed for human–robot interaction, Audio-Visual SLAM is a framework that provides the fusion of landmark features obtained from both the acoustic and visual modalities within an environment.[22] Human interaction is characterized by features perceived in not only the visual modality, but the acoustic modality as well; as such, SLAM algorithms for human-centered robots and machines must account for both sets of features. An Audio-Visual framework estimates and maps positions of human landmarks through use of visual features like human pose, and audio features like human speech, and fuses the beliefs for a more robust map of the environment. For applications in mobile robotics (e.g., drones, service robots), it is valuable to use low-power, lightweight equipment such as monocular cameras or microelectronic microphone arrays. Audio-Visual SLAM can also allow for complementary function of such sensors, by compensating for the narrow field of view, feature occlusions, and optical degradations common to lightweight visual sensors with the full field of view and unobstructed feature representations inherent to audio sensors. The susceptibility of audio sensors to reverberation, sound source inactivity, and noise can likewise be compensated for through fusion of landmark beliefs from the visual modality. Complementary function between the audio and visual modalities in an environment can prove valuable for the creation of robots and machines that fully interact with human speech and human movement.



Localization and mapping as separate problems




The two main approaches


Filtering


Graph-based


Important: the impact of deep learning

Optional: mathematical formulation


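Given a series of controls $u_t$ and sensor observations $o_t$ over discrete time steps $t$, the SLAM problem is to compute an estimate of the agent's location $x_t$ and a map of the environment $m_t$. All quantities are usually probabilistic, so the objective is to compute:

$$P(m_{t+1}, x_{t+1} \mid o_{1:t+1}, u_{1:t})$$

Applying Bayes' rule gives a framework for sequentially updating the location posteriors, given a map and a transition function $P(x_t \mid x_{t-1})$,

$$P(x_t \mid o_{1:t}, u_{1:t}, m_t) = \sum_{m_{t-1}} P(o_t \mid x_t, m_t, u_{1:t}) \sum_{x_{t-1}} P(x_t \mid x_{t-1}) \, P(x_{t-1} \mid m_t, o_{1:t-1}, u_{1:t}) / Z$$

where $Z$ is a normalizing constant. Similarly the map can be updated sequentially by

$$P(m_t \mid x_t, o_{1:t}, u_{1:t}) = \sum_{x_t} \sum_{m_t} P(m_t \mid x_t, m_{t-1}, o_t, u_{1:t}) \, P(m_{t-1}, x_t \mid o_{1:t-1}, m_{t-1}, u_{1:t})$$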

As in many inference problems, a locally optimal joint solution for the two variables can be found by alternating updates of the two beliefs, in a form of expectation–maximization (EM) algorithm.
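The sequential location update described in this section (propagate the belief through the transition function, reweight it by the observation likelihood, then normalize) can be sketched as a discrete Bayes filter. The example assumes a known map, a circular one-dimensional corridor, and hypothetical noise parameters `p_move` and `p_hit`; it covers only the localization half of the problem, not the map update.

```python
def predict(belief, p_move=0.8):
    """Motion update: the agent tries to step one cell to the right on a
    circular corridor and succeeds with probability p_move."""
    n = len(belief)
    return [p_move * belief[(i - 1) % n] + (1 - p_move) * belief[i]
            for i in range(n)]

def correct(belief, world_map, observation, p_hit=0.9):
    """Measurement update: weight each cell by the observation likelihood,
    then divide by the normalizing constant."""
    weighted = [b * (p_hit if cell == observation else 1 - p_hit)
                for b, cell in zip(belief, world_map)]
    z = sum(weighted)
    return [w / z for w in weighted]

world_map = ["door", "wall", "wall", "door", "wall"]    # assumed known map
belief = [1.0 / len(world_map)] * len(world_map)        # uniform prior over cells
for observation in ["door", "wall", "wall", "door"]:    # hypothetical readings
    belief = correct(predict(belief), world_map, observation)
most_likely_cell = max(range(len(belief)), key=belief.__getitem__)
```

A full SLAM system would alternate a step like this with an update of the map belief, as in the alternating scheme described above.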


See also


References

  1. ^ Durrant-Whyte, H.; Bailey, T. (2006). "Simultaneous localization and mapping: part I". IEEE Robotics & Automation Magazine. 13 (2): 99–110. CiteSeerX 10.1.1.135.9810. doi:10.1109/mra.2006.1638022. ISSN 1070-9932.
  2. ^ Bailey, T.; Durrant-Whyte, H. (2006). "Simultaneous localization and mapping (SLAM): part II". IEEE Robotics & Automation Magazine. 13 (3): 108–117. doi:10.1109/mra.2006.1678144. ISSN 1070-9932.
  3. ^ Cadena, Cesar; Carlone, Luca; Carrillo, Henry; Latif, Yasir; Scaramuzza, Davide; Neira, Jose; Reid, Ian; Leonard, John J. (2016). "Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age". IEEE Transactions on Robotics. 32 (6): 1309–1332. arXiv:1606.05830. Bibcode:2016arXiv160605830C. doi:10.1109/tro.2016.2624754. hdl:2440/107554. ISSN 1552-3098.
  4. ^ Perera, Samunda; Barnes, Nick; Zelinsky, Alexander (2014), Ikeuchi, Katsushi (ed.), "Exploration: Simultaneous Localization and Mapping (SLAM)", Computer Vision: A Reference Guide, Springer US, pp. 268–275, doi:10.1007/978-0-387-31439-6_280, ISBN 9780387314396
  5. ^ Mountney, P.; et al. (Stoyanov, D.; Davison, A.; Yang, G-Z.) (2006). "Simultaneous Stereoscope Localization and Soft-Tissue Mapping for Minimal Invasive Surgery". MICCAI. Lecture Notes in Computer Science. 1 (Pt 1): 347–354. doi:10.1007/11866565_43. ISBN 978-3-540-44707-8. PMID 17354909. Retrieved 2010-07-30.
  6. ^ Smith, R.C.; Cheeseman, P. (1986). "On the Representation and Estimation of Spatial Uncertainty" (PDF). The International Journal of Robotics Research. 5 (4): 56–68. doi:10.1177/027836498600500404. Retrieved 2008-04-08.
  7. ^ Smith, R.C.; Self, M.; Cheeseman, P. (1986). "Estimating Uncertain Spatial Relationships in Robotics" (PDF). Proceedings of the Second Annual Conference on Uncertainty in Artificial Intelligence. UAI '86. University of Pennsylvania, Philadelphia, PA, USA: Elsevier. pp. 435–461. Archived from the original (PDF) on 2010-07-02.
  8. ^ Leonard, J.J.; Durrant-Whyte, H.F. (1991). "Simultaneous map building and localization for an autonomous mobile robot". Proceedings IROS '91: IEEE/RSJ International Workshop on Intelligent Robots and Systems '91, "Intelligence for Mechanical Systems". pp. 1442–1447. doi:10.1109/IROS.1991.174711. ISBN 978-0-7803-0067-5. Retrieved 2008-04-08.
  9. ^ Knight, Will. "With a Roomba Capable of Navigation, iRobot Eyes Advanced Home Robots". MIT Technology Review. Retrieved 2018-04-25.
  10. ^ Howard, MW; Fotedar, MS; Datey, AV; Hasselmo, ME (2005). "The temporal context model in spatial navigation and relational learning: toward a common explanation of medial temporal lobe function across domains". Psychological Review. 112 (1): 75–116. doi:10.1037/0033-295X.112.1.75. PMC 1421376. PMID 15631589.
  11. ^ Fox, C; Prescott, T (2010). "Hippocampus as Unitary Coherent Particle Filter" (PDF). The 2010 International Joint Conference on Neural Networks (IJCNN). pp. 1–8. doi:10.1109/IJCNN.2010.5596681. ISBN 978-1-4244-6916-1.
  12. ^ Milford, MJ; Wyeth, GF; Prasser, D. RatSLAM: a hippocampal model for simultaneous localization and mapping (PDF). Proceedings. ICRA'04. IEEE International Conference on. Vol. 1. IEEE, 2004.
  13. ^ a b Magnabosco, M.; Breckon, T.P. (February 2013). "Cross-Spectral Visual Simultaneous Localization And Mapping (SLAM) with Sensor Handover" (PDF). Robotics and Autonomous Systems. 63 (2): 195–208. doi:10.1016/j.robot.2012.09.023. Retrieved 5 November 2013.
  14. ^ Karlsson, N.; et al. (Di Bernardo, E.; Ostrowski, J; Goncalves, L.; Pirjanian, P.; Munich, M.) (2005). The vSLAM Algorithm for Robust Localization and Mapping. Int. Conf. on Robotics and Automation (ICRA).
  15. ^ Fox, C.; Evans, M.; Pearson, M.; Prescott, T. (2012). Tactile SLAM with a biomimetic whiskered robot (PDF). Proc. IEEE Int. Conf. on Robotics and Automation (ICRA).
  16. ^ Marck, J.W.; Mohamoud, A.; v.d. Houwen, E.; van Heijster, R. (2013). Indoor radar SLAM A radar application for vision and GPS denied environments (PDF). Radar Conference (EuRAD), 2013 European.
  17. ^ Evers, Christine, Alastair H. Moore, and Patrick A. Naylor. "Acoustic simultaneous localization and mapping (a-SLAM) of a moving microphone array and its surrounding speakers." 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016.
  18. ^ Ferris, Brian, Dieter Fox, and Neil D. Lawrence. "Wifi-slam using gaussian process latent variable models." IJCAI. Vol. 7. No. 1. 2007.
  19. ^ Robertson, P.; Angermann, M.; Krach, B. (2009). Simultaneous Localization and Mapping for Pedestrians using only Foot-Mounted Inertial Sensors (PDF). Ubicomp 2009. Orlando, Florida, USA: ACM. doi:10.1145/1620545.1620560. Archived from the original (PDF) on 2010-08-16.
  20. ^ Evers, Christine; Naylor, Patrick A. (September 2018). "Acoustic SLAM". IEEE/ACM Transactions on Audio, Speech, and Language Processing. 26 (9): 1484–1498. doi:10.1109/TASLP.2018.2828321. ISSN 2329-9290.
  21. ^ Mahler, R.P.S. (October 2003). "Multitarget bayes filtering via first-order multitarget moments". IEEE Transactions on Aerospace and Electronic Systems. 39 (4): 1152–1178. doi:10.1109/TAES.2003.1261119. ISSN 0018-9251.
  22. ^ Chau, Aaron; Sekiguchi, Kouhei; Nugraha, Aditya Arie; Yoshii, Kazuyoshi; Funakoshi, Kotaro (October 2019). "Audio-Visual SLAM towards Human Tracking and Human-Robot Interaction in Indoor Environments". 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). New Delhi, India: IEEE: 1–8. doi:10.1109/RO-MAN46459.2019.8956321. ISBN 978-1-7281-2622-7.


Category:Robot navigation Category:Applied machine learning Category:Motion in computer vision