Event Title
Data Spillage Remediation Techniques in Hadoop
Location
Huntsville (Ala.)
Start Date
6-7-2017
Presentation Type
Paper
Description
Hadoop implements its own file system, HDFS (Hadoop Distributed File System), designed to run on commodity hardware to store and process large data sets. Data spillage is a condition in which a data set of higher classification is accidentally stored on a system of lower classification. When deleted, the spilled data remains forensically retrievable because file systems typically implement deletion by merely resetting pointers and marking the corresponding space as available. The problem is compounded in Hadoop, which is designed to create and store multiple copies of the same data to ensure high availability, thereby increasing the risk to data confidentiality. This paper proposes three approaches to eliminate that risk. In the first approach, the spilled data is securely overwritten multiple times with zero and random fills at the OS level, rendering it forensically irretrievable. In the second approach, Hadoop's built-in delete function is enhanced to implement a secure deletion mechanism. In the third approach, the hard drives of the data nodes holding spilled data are replaced with new ones and the old drives are destroyed. The paper also evaluates all three approaches to arrive at an optimal solution that is implementable in a large-scale production environment.
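The first approach lends itself to a short illustration. The sketch below is not taken from the paper; it shows, under stated assumptions, how a spilled HDFS block replica file on a data node's local disk might be overwritten at the OS level with zero and random fills before normal deletion. The pass sequence, buffer size, and command-line usage are illustrative assumptions, and locating the replica paths for a spilled file (for example with HDFS's fsck tooling, which can report block locations) is left out.

import java.io.RandomAccessFile;
import java.security.SecureRandom;

// Sketch of approach one: multi-pass overwrite of a spilled HDFS block
// replica file on a data node's local disk, prior to normal deletion.
// The pass sequence (zero, random, zero) is an illustrative assumption.
public class SecureOverwrite {

    private static final int BUFFER_SIZE = 64 * 1024;

    // One overwrite pass: fill the whole file with zeros or random bytes.
    static void overwritePass(RandomAccessFile raf, boolean randomFill) throws Exception {
        SecureRandom rng = new SecureRandom();
        byte[] buffer = new byte[BUFFER_SIZE];   // zero-filled by default
        long length = raf.length();
        raf.seek(0);
        long written = 0;
        while (written < length) {
            if (randomFill) {
                rng.nextBytes(buffer);
            }
            int chunk = (int) Math.min(BUFFER_SIZE, length - written);
            raf.write(buffer, 0, chunk);
            written += chunk;
        }
        raf.getFD().sync();   // flush this pass to disk before the next one
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java SecureOverwrite <path-to-block-replica-file>");
            return;
        }
        try (RandomAccessFile raf = new RandomAccessFile(args[0], "rw")) {
            overwritePass(raf, false);  // pass 1: zero fill
            overwritePass(raf, true);   // pass 2: random fill
            overwritePass(raf, false);  // pass 3: zero fill
        }
        // After the passes complete, the replica file can be deleted normally.
    }
}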
Recommended Citation
Jantali, Srinivas and Mani, Sunanda, "Data Spillage Remediation Techniques in Hadoop" (2017). National Cyber Summit. 6.
https://louis.uah.edu/cyber-summit/ncs2017/ncs2017papers/6