ENGINEERING RESEARCH PAPERS

big data spark IEEE PAPER

A Design of High-speed Big Data Query Processing System for Social Data Analysis: Using Spark SQL
Abstract Social network service generates formal, semi-formal and informal data continuously and such social data have complicated and diverse features. In these days massive social data are created in a real-time manner and the existing query processing

Evolutionary undersampling for extremely imbalanced big data classification under Apache Spark
Abstract: The classification of datasets with a skewed class distribution is an important problem in data mining. Evolutionary undersampling of the majority class has proved to be a successful approach to tackle this issue. Such a challenging task may become even more

Big Data Analytics with Datalog Queries on Spark
ABSTRACT There is great interest in exploiting the opportunity provided by cloud computing platforms for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics. Developing efficient complex

SIDELOADING INGESTION OF LARGE POINT CLOUDS INTO THE APACHE SPARK BIG DATA ENGINE
ABSTRACT: In the geospatial domain we have now reached the point where data volumes we handle have clearly grown beyond the capacity of most desktop computers. This is particularly true in the area of point cloud processing. It is therefore naturally lucrative to

Spark-BDD: Debugging Big Data Applications
Apache Spark has become a key platform for Big Data Analytics, yet it lacks complete support for debugging analytics programs. As a result, the development of a new analytical toolkit can be a painstakingly long process. To fill this gap, we are developing

Spark: the Next-generation Processing Engine for Big Data
Despite its beauty in processing big data, RDD is still a little distal from the data structures that people are familiar with, e.g., SQL schema, data frame. The recent release of Spark introduces DataFrame into its ecosystem. The columnar organized data structure is

Review: Apache Spark and Big Data Analytics for Solving Real World Problems
ABSTRACT Big Data analysis is having an impact on every industry today. Industry leaders are capitalizing on these new business insights to drive competitive advantage. Apache Hadoop is the most common Big Data Framework, but the technology is evolving rapidly

Big Data Analytics Hadoop and Spark
Once connected, Spark acquires executors on nodes in the cluster, which are worker processes that run computations and store data for your application. Next, it sends your application code (JAR or Python files) to the executors. Finally, SparkContext (SC) sends tasks for the
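
The flow in this excerpt (acquire workers, ship the application code, send tasks) can be illustrated with a toy model. The sketch below is not Spark and does not use Spark's API: it assumes a thread pool from Python's standard library as a stand-in for executors, which in Spark are separate worker processes on cluster nodes. The names `word_count` and `run_job` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(partition):
    """Task body: count the words in one partition (a list of lines)."""
    return sum(len(line.split()) for line in partition)


def run_job(partitions, num_executors=2):
    """Driver side: acquire workers, send one task per partition, combine results."""
    # Acquire a fixed pool of workers (stand-in for executors on cluster nodes).
    with ThreadPoolExecutor(max_workers=num_executors) as executors:
        # Send the task function and one partition to a worker per task.
        partial_counts = list(executors.map(word_count, partitions))
    # Combine the per-partition results back on the driver.
    return sum(partial_counts)


# Two partitions of a tiny, made-up dataset.
partitions = [["big data spark", "apache spark"], ["hadoop and spark"]]
total = run_job(partitions)
print(total)
```

The point of the sketch is the division of labour: the driver only schedules tasks and merges results, while each worker sees a single partition.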

A REVIEW: MAPREDUCE AND SPARK FOR BIG DATA ANALYTICS
ABSTRACT In this paper we discuss the various challenges of Big Data and problem arises due to continuous explosion of data resulting from the likes of social media and other online sources to gain access to deeper analysis of their data. This paper discusses two of the

Big Data Analytics on Cray XC Series DataWarp using Hadoop, Spark and Flink

Performance Comparison of MySQL Cluster and Apache Spark for Big Data Applications
Abstract Working with data involves two major factors, storing the data and performing computations by accessing the data. MySQL is the first Database Management Software that provided an effective and efficient method for data storage and computations. However,

A Review Study of Apache Spark in Big Data Processing
ABSTRACT Why Spark becomes a hot topic in Big Data analytics? Is really Apache Spark going to replace Hadoop? If we involved seriously into Big Data analytics, then, should we really care about Spark? Apache Spark is a lightning-fast cluster computing designed for

Big Data and Apache Spark: A Review
Abstract: Big Data is currently a very burning topic in the fields of Computer Science and Business Intelligence, and with such a scenario at our doorstep, a humungous amount of information waits to be documented properly with emphasis on the market. By market, we

RDF Big Data Management on top of Spark
Modern information and knowledge-centric applications produce, store, integrate, query, analyze and visualize rapidly growing data sets. Traditional data processing technologies (data management and warehousing systems) are inadequate for processing this data

Comparison of MapReduce and Spark Programming Frameworks for Big Data Analytics on HDFS
Abstract: Use of internet and all the types of computer automated systems generates large amount of data in different forms. Due to large volume, different types of varieties, and high velocity of this type of data emerges the Big Data Problem. Spark and MapReduce

Static and Dynamic Big Data Partitioning on Apache Spark.
Abstract. Many of today's large datasets are organized as a graph. Due to their size it is often infeasible to process these graphs using a single machine. Therefore, many software frameworks and tools have been proposed to process graph on top of distributed

A Case Study Comparing Different Big-Data Handling Approaches Using Hadoop-Hive VS Spark-Shark
ABSTRACT We have worked on the implementation of a simple analytics engine to process huge amount of Wikipedia dump on two different cluster platforms that compares two big data technologies Hadoop-Hive vs. Spark-Shark. MapReduce's greatest strength is

This book is a concise and easy-to-understand tutorial for big data and Spark. It will help you learn how to use Spark for a variety of big data analytic tasks. It covers everything that you need to know to productively use Spark. One of the benefits of purchasing this book is that

A Parallel Random Forest Algorithm for Big Data in a Spark Cloud Computing Environment
Abstract: With the emergence of the big data age, the issue of how to obtain valuable knowledge from a dataset efficiently and accurately has attracted increasingly attention from both academia and industry. This paper presents a Parallel Random Forest (PRF)

Migrating GIS Big Data Computing from Hadoop to Spark: An Exemplary Study Using Twitter
Abstract: Recent research has demonstrated that social media could provide valuable spatio-temporal data about users' activities. However, information extraction and computation from big amount of data pose various challenges. To effectively process