Evaluation of Machine Learning Frameworks on Bank Marketing and Higgs Datasets

Shashidhara, B.M.; Jain, S.; Rao, V.D.; Patil, N.; Raghavendra, G.S.

Please use this identifier to cite or link to this item: http://idr.nitk.ac.in/jspui/handle/123456789/8472

Full metadata record

DC Field	Value	Language
dc.contributor.author	Shashidhara, B.M.
dc.contributor.author	Jain, S.
dc.contributor.author	Rao, V.D.
dc.contributor.author	Patil, N.
dc.contributor.author	Raghavendra, G.S.
dc.date.accessioned	2020-03-30T10:18:47Z	-
dc.date.available	2020-03-30T10:18:47Z	-
dc.date.issued	2015
dc.identifier.citation	Proceedings - 2015 2nd IEEE International Conference on Advances in Computing and Communication Engineering, ICACCE 2015, 2015, pp. 551-555	en_US
dc.identifier.uri	http://idr.nitk.ac.in/jspui/handle/123456789/8472	-
dc.description.abstract	Big data is an emerging field in which datasets of various sizes are being analyzed for potential applications. In parallel, many frameworks are being introduced through which these datasets can be fed into machine learning algorithms. Though some experiments have compared different machine learning algorithms on different data, these comparisons have not been carried out across different platforms. Our research aims to compare two selected machine learning algorithms on datasets of different sizes, deployed on different platforms such as Weka, Scikit-Learn and Apache Spark. They are evaluated on training time, accuracy and root mean squared error. This comparison helps to decide which platform is best suited for applying the selected, computationally expensive machine learning algorithms to data of a particular size. Experiments suggested that Scikit-Learn would be optimal for data that fits into memory. For huge data, Apache Spark would be optimal, as it performs parallel computations by distributing the data over a cluster. Hence this study concludes that the Spark platform, which has growing support for parallel implementations of machine learning algorithms, could be optimal for analyzing big data. © 2015 IEEE.	en_US
dc.title	Evaluation of Machine Learning Frameworks on Bank Marketing and Higgs Datasets	en_US
dc.type	Book chapter	en_US
Appears in Collections:	2. Conference Papers
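
The abstract names three evaluation criteria: training time, accuracy and root mean squared error. The sketch below shows one way these could be computed for a single model, using only the Python standard library. It is not the authors' code: the record does not say which two algorithms were used or how the metrics were implemented, so the `MajorityClassifier` placeholder, the `evaluate` helper and the toy data are all hypothetical and exist only to make the example runnable.

```python
import math
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


class MajorityClassifier:
    """Hypothetical placeholder model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return training time in seconds, accuracy and RMSE for one model."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start  # wall-clock time of fit() only
    y_pred = model.predict(X_test)
    return {
        "train_seconds": train_seconds,
        "accuracy": accuracy(y_test, y_pred),
        "rmse": rmse(y_test, y_pred),
    }


# Toy binary data, invented purely for illustration (not from either dataset).
X_train, y_train = [[0], [1], [2], [3]], [0, 0, 0, 1]
X_test, y_test = [[4], [5], [6], [7]], [0, 1, 0, 0]
result = evaluate(MajorityClassifier(), X_train, y_train, X_test, y_test)
```

Any object exposing `fit` and `predict` could be passed to `evaluate` in place of the placeholder, which is what would make a like-for-like comparison across models possible under these assumptions.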
