Posts

Showing posts from December, 2023

Hadoop uses the concept of parallelism to upload split data, addressing the Velocity problem.

  TASK DESCRIPTION:
🔷 According to popular articles, Hadoop uses the concept of parallelism to upload split data, addressing the Velocity problem.
👉🏻 Research with your teams and conclude this statement with proper proof.
✴️ Hint: tcpdump

>> tcpdump is a powerful and widely used command-line packet analyzer tool used to capture or filter TCP/IP packets received or transferred over a network on a specific interface. It also gives us an option to save captured packets to a file for future analysis. For this task I created a cluster and traced how packets flow using tcpdump.

Step 1: Upload data from any client so we can observe how the packets are transferred.

Step 2: Read the file back to observe how files are read from the Hadoop cluster.

Conclusion: I found the client uploads data only to the first DataNode, and the remaining replicas are made by the DataNodes themselves, like I...
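The capture described in Steps 1 and 2 can be sketched as below. This is a minimal sketch, not the post's exact commands: the interface name `eth0`, the output path, and the sample file name are assumptions, and the port 9866 is the default DataNode data-transfer port in Hadoop 3.x (older 1.x/2.x clusters use 50010), so adjust for your version.

```shell
#!/bin/sh
# Assumed values -- replace with your DataNode's interface and Hadoop version's port.
IFACE=eth0
DN_PORT=9866   # 50010 on Hadoop 1.x/2.x

# Command to run on each DataNode while the client uploads:
CAPTURE_CMD="tcpdump -i $IFACE -nn port $DN_PORT -w /tmp/dn_traffic.pcap"
echo "Run on each DataNode: $CAPTURE_CMD"

# While the capture is running, upload a file from the client (Step 1):
echo "Run on the client:    hadoop fs -put bigfile.dat /user/test/"

# Afterwards, replay the capture to see which host talked to which (Step 2):
echo "Inspect with:         tcpdump -r /tmp/dn_traffic.pcap -nn | head"
```

Comparing the source/destination addresses in each DataNode's capture is what shows whether the client writes to every replica or only to the first node in the pipeline.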

In a Hadoop cluster, how can a slave node contribute a limited/specific amount of storage to the cluster?

  Step 1: Identify Available Storage

Before you allocate storage to your Hadoop cluster, it is crucial to have a clear understanding of the existing storage resources on the slave node. The goal is to identify the disk or partition that you intend to contribute to the Hadoop cluster. Here is a more detailed breakdown:

1.1 Check Current Disk Space: Begin by using the df (disk free) command to display information about the current disk space on the slave node.

df -h

The -h flag stands for "human-readable," making the output easier to understand. This command provides an overview of the mounted filesystems along with their sizes, used space, and available space.

1.2 Identify the Disk or Partition: Analyze the output of the df command to identify the disk or partition you want to allocate to the Hadoop cluster. Disks are ty...
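Step 1.1 can also be scripted so the available space is captured in a variable rather than read by eye. This is a minimal sketch under stated assumptions: the directory /tmp/hadoop-slave is a hypothetical path standing in for wherever you plan to point the DataNode's data directory.

```shell
#!/bin/sh
# Hypothetical directory we plan to offer to the cluster.
DATA_DIR=/tmp/hadoop-slave
mkdir -p "$DATA_DIR"

# Human-readable overview of the filesystem backing that directory,
# exactly as in Step 1.1 above:
df -h "$DATA_DIR"

# For scripting, -P forces POSIX single-line output and -k reports sizes
# in 1K blocks; column 4 of the second line is the "Available" figure.
AVAIL_KB=$(df -kP "$DATA_DIR" | awk 'NR==2 {print $4}')
echo "Available on $DATA_DIR: ${AVAIL_KB} KB"
```

Once the target disk or partition is known, a common way to cap the contribution is to mount a partition of exactly the desired size at that path before listing it as the DataNode's data directory.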