An improvement method of DBSCAN algorithm on cloud computing
Abstract:
DBSCAN is a density-based data clustering algorithm that is widely used in image processing, data mining, machine learning,
and other fields. As the size of clusters increases, parallel DBSCAN algorithms have come into wide use. However, we consider
the current partitioning method of parallel DBSCAN too simple, and the GETNEIGHBORS query step repeatedly accesses the data
set on Spark. We therefore propose DBSCAN-PSM, which applies a new data partitioning and merging method. In the first stage
of our method, we introduce a KD-tree and combine partitioning with the GETNEIGHBORS query, which reduces the number of
accesses to the data set and decreases the influence of I/O on the algorithm. In the second stage, we use the features of
points during merging to avoid the time cost of global labeling. Experimental results show that the new method improves
parallel efficiency and clustering performance.
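
To make the role of the KD-tree in the neighbor query concrete, here is a minimal single-machine sketch in Python. It is not the authors' DBSCAN-PSM implementation and does not use Spark. It only illustrates how a KD-tree can answer the eps-neighborhood query inside plain DBSCAN without scanning the whole data set for every point. The names `build_kdtree`, `get_neighbors`, and `dbscan`, and the parameters `eps` and `min_pts`, are assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical node layout: index of the splitting point, split axis, subtrees.
Node = namedtuple("Node", "index axis left right")
NOISE = -1


def build_kdtree(points, indices=None, depth=0):
    """Build a KD-tree over point indices, splitting on the median per axis."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return Node(
        index=indices[mid],
        axis=axis,
        left=build_kdtree(points, indices[:mid], depth + 1),
        right=build_kdtree(points, indices[mid + 1:], depth + 1),
    )


def get_neighbors(node, points, query, eps, found=None):
    """Return indices of all points within Euclidean distance eps of query."""
    if found is None:
        found = []
    if node is None:
        return found
    p = points[node.index]
    if sum((a - b) ** 2 for a, b in zip(p, query)) <= eps * eps:
        found.append(node.index)
    diff = query[node.axis] - p[node.axis]
    # Always search the side containing the query; search the other side
    # only if the eps-ball reaches across the splitting plane.
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    get_neighbors(near, points, query, eps, found)
    if abs(diff) <= eps:
        get_neighbors(far, points, query, eps, found)
    return found


def dbscan(points, eps, min_pts):
    """Plain DBSCAN that uses the KD-tree for every neighborhood query."""
    tree = build_kdtree(points)
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = get_neighbors(tree, points, points[i], eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = get_neighbors(tree, points, points[j], eps)
            if len(j_neighbors) >= min_pts:  # only core points expand the cluster
                queue.extend(j_neighbors)
        cluster += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # prints [0, 0, 0, 0, 1, 1, 1, -1]
```

In a distributed setting, each partition would presumably build and query its own tree, and clusters that span partition boundaries would then be merged. That merging step is specific to the paper and is not sketched here.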
Description:
Original English ISI article
Year of publication: 2019
Original English ISI file, in PDF format
Number of pages in the original English ISI file: 9
Authors / Descriptions: Year of publication 2019 - ISI article / Authors: Weipeng Jing, Chuanyu Zhao, Chao Jiang
Sent date:
1398/02/27 | 5/17/2019
Number of visits:
936
Keywords:
Big data, DBSCAN, Data partitioning, Data merging
Number of pages:
9