Evaluation of the Hadoop Framework and Power View in the Visualization of Vehicular GPS Trajectories
Volume 16, Issue 2, June 2016, Pages 378–389
Gary Reyes Zambrano1, José Alvarado Santos2, Katia Villafuerte Ponce3, Oscar Leon de La Torre4, Fernando Coral Moran5, and Vicente Arreaga Figueroa6
1 Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Guayaquil, Ecuador
2 Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Guayaquil, Ecuador
3 Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Guayaquil, Ecuador
4 Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Guayaquil, Ecuador
5 Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Guayaquil, Ecuador
6 Facultad de Ciencias Matemáticas y Físicas, Universidad de Guayaquil, Guayaquil, Ecuador
Original language: Spanish
Copyright © 2016 ISSR Journals. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
This article describes an evaluation of the Hadoop framework and the Power View add-in for Excel through an experiment that analyzes a large volume of vehicular GPS trajectory data. To exercise Hadoop's own tools, the study uses a US dataset containing information on trucks and their routes. The research was carried out in three phases: 1) selection of the working environment, identifying the optimal characteristics and the hardware required to run Hadoop; 2) configuration of the environment and of the features needed to analyze GPS trajectories; and 3) loading, analysis, and visualization of the results. Hive is studied both as a data warehouse and as a means of converting the tables to the ORC format, which facilitates processing. In the data analysis stage, MapReduce was used to run algorithms and Pig to carry out a risk study through conversions of SQL code. Finally, the results are visualized and interpreted with Power View, a feature of Microsoft Excel 2013, which displays a map with the GPS coordinates of all vehicles; from this analysis the authors conclude that 40% of accidents on the roads of California, USA, are caused by driver fatigue. Future work will generate GPS trajectories for the city of Guayaquil in order to identify patterns in their behavior.
Author Keywords: Business, Intelligent, Hive, MapReduce, SQL.
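The abstract does not reproduce the Pig script or the MapReduce jobs used in the risk study. As a rough illustration only, the following plain-Python sketch mimics a map step and a reduce step that count risky driving events per driver and normalize them by distance. It does not use Hadoop or Pig, and the record layout, the event names, the sample values, and the "risky events per 100 miles" metric are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict

# Hypothetical GPS event records: (driver_id, event_type, miles_for_this_record).
# Field names and values are illustrative; they are not from the paper's dataset.
EVENTS = [
    ("A1", "normal", 12.0),
    ("A1", "overspeed", 8.0),
    ("B2", "normal", 20.0),
    ("B2", "lane_departure", 5.0),
    ("B2", "unsafe_following", 5.0),
    ("C3", "normal", 30.0),
]


def map_phase(records):
    """Emit one (driver_id, (risky_flag, miles)) pair per input record."""
    for driver_id, event_type, miles in records:
        risky_flag = 0 if event_type == "normal" else 1
        yield driver_id, (risky_flag, miles)


def reduce_phase(pairs):
    """Sum risky-event counts and miles for each driver key."""
    totals = defaultdict(lambda: [0, 0.0])
    for driver_id, (risky_flag, miles) in pairs:
        totals[driver_id][0] += risky_flag
        totals[driver_id][1] += miles
    return {driver: (count, miles) for driver, (count, miles) in totals.items()}


def risk_per_100_miles(totals):
    """Assumed metric: risky events per 100 miles driven (skips zero mileage)."""
    return {
        driver: 100.0 * count / miles
        for driver, (count, miles) in totals.items()
        if miles > 0
    }


totals = reduce_phase(map_phase(EVENTS))
risk = risk_per_100_miles(totals)

# Print drivers from highest to lowest assumed risk score.
for driver, score in sorted(risk.items(), key=lambda item: item[1], reverse=True):
    print(driver, round(score, 2))
```

In a real Hadoop deployment the grouping by driver would be handled by the framework's shuffle stage rather than by an in-memory dictionary; the sketch only shows the shape of the per-key aggregation.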
Abstract (translated from the original Spanish)
This article describes the evaluation of the Hadoop framework and of the Power View add-in for Excel by means of an experiment analyzing a large volume of vehicular GPS trajectory information. In order to carry out a study that makes use of Hadoop's own tools, we use a US dataset with information on trucks and their respective routes. The research followed these phases: 1) selecting the working environment, where we examine the optimal characteristics and the hardware needed to work with Hadoop; 2) configuring the environment and the features for the analysis of GPS trajectories; 3) loading, analyzing, and visualizing the results. We study the use of Hive as a data warehouse and for transforming the tables to an ORC format that facilitates information processing. In the data analysis stage, MapReduce was used to run algorithms and Pig to perform a risk study through conversions of SQL code. Lastly, the results are visualized and interpreted with Power View, a feature of Microsoft Excel 2013, which shows a map with all the GPS coordinates of the vehicles; using analysis techniques, we were able to conclude that 40% of accidents on the roads of California, USA, are caused by driver fatigue. In future work we will generate GPS trajectories for the city of Guayaquil to determine patterns in their behavior.
How to Cite this Article
Gary Reyes Zambrano, José Alvarado Santos, Katia Villafuerte Ponce, Oscar Leon de La Torre, Fernando Coral Moran, and Vicente Arreaga Figueroa, “Evaluation framework Hadoop and Power View display in GPS Vehicle Trajectories,” International Journal of Innovation and Applied Studies, vol. 16, no. 2, pp. 378–389, June 2016.