Cassandra

Pre-Installation Setup

Before installing Cassandra in Linux environment, we require to set up Linux using ssh (Secure Shell). Follow the steps given below for setting up Linux environment.

Create a User

At the beginning, it is recommended to create a separate user for Hadoop to isolate Hadoop file system from Unix file system. Follow the steps given below to create a user.
  • Open root using the command “su”.
  • Create a user from the root account using the command “useradd username”.
  • Now you can open an existing user account using the command “su username”.
Open the Linux terminal and type the following commands to create a user.
$ su
password:
# useradd hadoop
# passwd hadoop
New passwd:
Retype new passwd

SSH Setup and Key Generation

SSH setup is required to perform different operations on a cluster such as starting, stopping, and distributed daemon shell operations. To authenticate different users of Hadoop, it is required to provide public/private key pair for a Hadoop user and share it with different users.
The following commands are used for generating a key value pair using SSH:
  • copy the public keys form id_rsa.pub to authorized_keys,
  • and provide owner,
  • read and write permissions to authorized_keys file respectively.
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
  • Verify ssh:
ssh localhost

Installing Java

Java is the main prerequisite for Cassandra. First of all, you should verify the existence of Java in your system using the following command:
$ java -version
If everything works fine it will give you the following output.
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)
If you don’t have Java in your system, then follow the steps given below for installing Java.

Step 1

Download java (JDK <latest version> - X64.tar.gz) from the following link:
Then jdk-7u71-linux-x64.tar.gz will be downloaded onto your system.

Step 2

Generally you will find the downloaded java file in the Downloads folder. Verify it and extract the jdk-7u71-linux-x64.gz file using the following commands.
$ cd Downloads/
$ ls
jdk-7u71-linux-x64.gz
$ tar zxf jdk-7u71-linux-x64.gz
$ ls
jdk1.7.0_71 jdk-7u71-linux-x64.gz

Step 3

To make Java available to all users, you have to move it to the location “/usr/local/”. Open root, and type the following commands.
$ su
password:
# mv jdk1.7.0_71 /usr/local/
# exit

Step 4

For setting up PATH and JAVA_HOME variables, add the following commands to ~/.bashrc file.
export JAVA_HOME = /usr/local/jdk1.7.0_71
export PATH = $PATH:$JAVA_HOME/bin
Now apply all the changes into the current running system.
$ source ~/.bashrc

Step 5

Use the following commands to configure java alternatives.
# alternatives --install /usr/bin/java java usr/local/java/bin/java 2
# alternatives --install /usr/bin/javac javac usr/local/java/bin/javac 2
# alternatives --install /usr/bin/jar jar usr/local/java/bin/jar 2

# alternatives --set java usr/local/java/bin/java
# alternatives --set javac usr/local/java/bin/javac
# alternatives --set jar usr/local/java/bin/jar
Now use the java -version command from the terminal as explained above.

Setting the Path

Set the path of Cassandra path in “/.bahrc” as shown below.
[hadoop@linux ~]$ gedit ~/.bashrc

export CASSANDRA_HOME = ~/cassandra
export PATH = $PATH:$CASSANDRA_HOME/bin

Download Cassandra

Apache Cassandra is available at Download Link Cassandra using the following command.
$ wget http://supergsego.com/apache/cassandra/2.1.2/apache-cassandra-2.1.2-bin.tar.gz
Unzip Cassandra using the command zxvf as shown below.
$ tar zxvf apache-cassandra-2.1.2-bin.tar.gz.
Create a new directory named cassandra and move the contents of the downloaded file to it as shown below.
$ mkdir Cassandra
$ mv apache-cassandra-2.1.2/* cassandra.

Configure Cassandra

Open the cassandra.yaml: file, which will be available in the bin directory of Cassandra.
$ gedit cassandra.yaml
Note: If you have installed Cassandra from a deb or rpm package, the configuration files will be located in /etc/cassandra directory of Cassandra.
The above command opens the cassandra.yaml file. Verify the following configurations. By default, these values will be set to the specified directories.
  • data_file_directories “/var/lib/cassandra/data”
  • commitlog_directory “/var/lib/cassandra/commitlog”
  • saved_caches_directory “/var/lib/cassandra/saved_caches”
Make sure these directories exist and can be written to, as shown below.

Create Directories

As super-user, create the two directories /var/lib/cassandra and/var./log/cassandra into which Cassandra writes its data.
[root@linux cassandra]# mkdir /var/lib/cassandra
[root@linux cassandra]# mkdir /var/log/cassandra

Give Permissions to Folders

Give read-write permissions to the newly created folders as shown below.
[root@linux /]# chmod 777 /var/lib/cassandra
[root@linux /]# chmod 777 /var/log/cassandra

Start Cassandra

To start Cassandra, open the terminal window, navigate to Cassandra home directory/home, where you unpacked Cassandra, and run the following command to start your Cassandra server.
$ cd $CASSANDRA_HOME
$./bin/cassandra -f 
Using the –f option tells Cassandra to stay in the foreground instead of running as a background process. If everything goes fine, you can see the Cassandra server starting.

Programming Environment

To set up Cassandra programmatically, download the following jar files:
  • slf4j-api-1.7.5.jar
  • cassandra-driver-core-2.0.2.jar
  • guava-16.0.1.jar
  • metrics-core-3.0.2.jar
  • netty-3.9.0.Final.jar
Place them in a separate folder. For example, we are downloading these jars to a folder named “Cassandra_jars”.
Set the classpath for this folder in “.bashrc”file as shown below.
[hadoop@linux ~]$ gedit ~/.bashrc

//Set the following class path in the .bashrc file.

export CLASSPATH = $CLASSPATH:/home/hadoop/Cassandra_jars/*

Eclipse Environment

Open Eclipse and create a new project called Cassandra _Examples.
Right click on the project, select Build Path→Configure Build Path as shown below.
Cassandra Build Path
It will open the properties window. Under Libraries tab, select Add External JARs. Navigate to the directory where you saved your jar files. Select all the five jar files and click OK as shown below.
Cassandra External JARs
Under Referenced Libraries, you can see all the required jars added as shown below:
Eclipse3

Maven Dependencies

Given below is the pom.xml for building a Cassandra project using maven.
<project xmlns = "http://maven.apache.org/POM/4.0.0" 
   xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"  
   xsi:schemaLocation = "http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <build>
      <sourceDirectory>src</sourceDirectory>
      <plugins>
         <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.1</version>
    
               <configuration>
                  <source>1.7</source>
                  <target>1.7</target>
               </configuration>
     
         </plugin>
      </plugins>
   </build> 

   <dependencies>
      <dependency>
         <groupId>org.slf4j</groupId>
         <artifactId>slf4j-api</artifactId>
         <version>1.7.5</version>
      </dependency>
 
      <dependency>
         <groupId>com.datastax.cassandra</groupId>
         <artifactId>cassandra-driver-core</artifactId>
         <version>2.0.2</version>
      </dependency>
 
      <dependency>
         <groupId>com.google.guava</groupId>
         <artifactId>guava</artifactId>
         <version>16.0.1</version>
      </dependency>
 
      <dependency>
         <groupId>com.codahale.metrics</groupId>
         <artifactId>metrics-core</artifactId>
         <version>3.0.2</version>
      </dependency>
 
      <dependency>
         <groupId>io.netty</groupId>
         <artifactId>netty</artifactId>
         <version>3.9.0.Final</version>
      </dependency>
   </dependencies>

</project>



Install Cassandra on the Server

On the server machine only:

sudo apt update sudo apt install cassandra -y

Check status:

sudo systemctl status cassandra

3️⃣ Configure Cassandra to Accept Remote Connections

Edit Cassandra config:

sudo nano /etc/cassandra/cassandra.yaml

๐Ÿ”น Set listen_address

listen_address: <SERVER_IP>

Example:

listen_address: 192.168.1.100

๐Ÿ”น Set rpc_address

rpc_address: 0.0.0.0

๐Ÿ”น Set broadcast_rpc_address

broadcast_rpc_address: <SERVER_IP>

Save and restart:

sudo systemctl restart cassandra

4️⃣ Enable Username / Password Authentication

Edit:

sudo nano /etc/cassandra/cassandra.yaml

๐Ÿ”น Enable authenticator & authorizer

authenticator: PasswordAuthenticator authorizer: CassandraAuthorizer

Restart Cassandra:

sudo systemctl restart cassandra

5️⃣ Login as Default Admin

By default Cassandra has:

  • username: cassandra

  • password: cassandra

Login:

cqlsh <SERVER_IP> -u cassandra -p cassandra

6️⃣ Create Users (Roles) for Clients

๐Ÿ”น Create user for Client 1

CREATE ROLE user1 WITH PASSWORD = 'user1pass' AND LOGIN = true;

๐Ÿ”น Create user for Client 2

CREATE ROLE user2 WITH PASSWORD = 'user2pass' AND LOGIN = true;

๐Ÿ”น Create user for Client 3

CREATE ROLE user3 WITH PASSWORD = 'user3pass' AND LOGIN = true;

7️⃣ Grant Permissions (Very Important)

Example: Give access to a keyspace

GRANT ALL PERMISSIONS ON KEYSPACE mykeyspace TO user1; GRANT SELECT ON KEYSPACE mykeyspace TO user2; GRANT MODIFY ON KEYSPACE mykeyspace TO user3;

๐Ÿ‘‰ You can restrict:

  • SELECT → read only

  • MODIFY → insert/update

  • ALL PERMISSIONS → full access




✅ Step 1: Check Existing Keyspaces

Run:

DESCRIBE KEYSPACES;

✅ Step 2: Create the Keyspace (If Missing)

If college_db is NOT listed, create it:

CREATE KEYSPACE college_db WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 1 };




CREATE ROLE students WITH LOGIN = false;
✔ This creates a group role
✔ Cannot log in
✔ Used only for permissions

✅ Now Grant Permissions to the Group
sql
Copy code
GRANT SELECT ON KEYSPACE college_db TO students;

✅ Add Users to the Group

GRANT students TO user1;




Download cassandra from the following link

https://archive.apache.org/dist/cassandra/


๐Ÿ”น 1. cqlsh (Command Line)

  • Works perfectly on Windows

  • Each user logs in with own username/password

Go to bin folder of cassandra in windows and type


py -3.11 cqlsh.py <ip_address_server> -u user1 -p user1




1️⃣ CREATE (Insert Data)

๐Ÿ”น Create Keyspace

CREATE KEYSPACE college_db WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 1 };

๐Ÿ”น Create Table

CREATE TABLE college_db.students ( student_id int, name text, department text, year int, PRIMARY KEY (student_id) );

๐Ÿ”น Insert Data

INSERT INTO college_db.students (student_id, name, department, year) VALUES (1, 'Anu', 'CSE', 2);

✔ No COMMIT needed
✔ Insert = Create data


2️⃣ READ (Select Data)

๐Ÿ”น Select All Rows

SELECT * FROM college_db.students;

๐Ÿ”น Select Specific Columns

SELECT name, department FROM college_db.students;

๐Ÿ”น Select with WHERE (Partition Key Required)

SELECT * FROM college_db.students WHERE student_id = 1;

⚠ Cassandra requires PRIMARY KEY in WHERE.


๐Ÿ”น Limit Rows

SELECT * FROM college_db.students LIMIT 5;

3️⃣ UPDATE (Modify Data)

UPDATE college_db.students SET department = 'ECE', year = 3 WHERE student_id = 1;

✔ No UPDATE without WHERE
✔ WHERE must include primary key


4️⃣ DELETE (Remove Data)

๐Ÿ”น Delete a Row

DELETE FROM college_db.students WHERE student_id = 1;

๐Ÿ”น Delete a Column

DELETE department FROM college_db.students WHERE student_id = 2;

๐Ÿ”น Delete Entire Table

DROP TABLE college_db.students;

๐Ÿ”น Delete Keyspace

DROP KEYSPACE college_db;

๐Ÿงช Extra Useful Commands

๐Ÿ”น Show Tables

DESCRIBE TABLES;

๐Ÿ”น Show Table Structure

DESCRIBE TABLE college_db.students;

๐Ÿ”น Truncate Table (Delete all rows)

TRUNCATE college_db.students;




Comments

Popular posts from this blog