Bloom Filters – Implementation in Java

By Ananth Kumar Ganesna

Introduction:

A Bloom filter is a very compact data structure that supports approximate membership queries on a set, allowing false positives.

The term Bloom filter names a data structure that supports membership queries on a set of elements. It was introduced by Burton Bloom [1970]. It differs from ordinary dictionary data structures, as the result of a membership query might be “true” although the element is not actually contained in the set. Since the data structure is randomized by using hash functions, reporting a false positive occurs with a certain probability, called the false positive rate (FPR).
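
To make the false positive rate concrete, here is a minimal Java sketch of the standard textbook sizing approximations for a Bloom filter. It is an illustration rather than part of the original article: the class name BloomSizing, the method names, and the symbols n (expected number of items), m (number of bits), k (number of hash functions) and p (target FPR) are all assumptions introduced here.

```java
// Hypothetical helper, not from the article: standard Bloom filter sizing approximations.
final class BloomSizing {

    // Bits needed for n expected items at target false positive rate p:
    // m = -n * ln(p) / (ln 2)^2
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    // Number of hash functions that minimizes the FPR for given n and m:
    // k = (m / n) * ln 2, rounded to the nearest integer (at least 1)
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    // Approximate false positive rate after inserting n items:
    // p ~ (1 - e^(-k*n/m))^k
    static double expectedFpr(long n, long m, int k) {
        return Math.pow(1 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        long n = 1_000_000L;   // assumed number of items
        double p = 0.01;       // assumed target false positive rate (1%)
        long m = optimalBits(n, p);
        int k = optimalHashes(n, m);
        System.out.printf("bits=%d, hashes=%d, expected FPR=%.4f%n",
                m, k, expectedFpr(n, m, k));
    }
}
```

Under these approximations, one million items at a 1% target come to roughly 9.6 million bits (a little over 1 MB) and 7 hash functions, which is far less than storing the items themselves.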

Use cases:

A cache is useful for CPU-intensive operations: it stores their results so that they can be accessed quickly.

Sometimes, because of the I/O involved, you do not want to hit the database repeatedly and would prefer to cache the results, updating the cache only when the underlying data changes.

Similarly, there are other use cases where we need a quick lookup to decide what to do with an incoming request. For example, consider having to identify whether a URL points to a malware site. There could be many such URLs; caching all of them in memory on a single instance would require a lot of space.

Another use case is identifying whether a user-typed string refers to a place in the USA. In “museum in Washington”, for example, Washington is the name of a place in the USA. Should we keep all the places in the USA in memory and then look them up? How big would the cache be? Is it effective to do this without any database support?

This is where we need to move away from the basic map data structure and look for answers in a more advanced data structure such as the Bloom filter.

You can think of a Bloom filter as being like any other Java collection: you can put items in it and ask whether an item is already present (as with a HashSet). If the Bloom filter says that it does not contain the item, then the item is definitely not present. But if it says that it has seen the item, that may be wrong (a false positive). If we are careful, we can design a Bloom filter so that the probability of a wrong answer is controlled. (more…)
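
The put-and-ask behaviour described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the article's implementation: the class name SimpleBloomFilter, the methods put and mightContain, the FNV-1a hash, the double-hashing scheme used to derive k bit positions, and the example URLs are all choices made here for the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical minimal Bloom filter for strings; a sketch, not production code.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;       // number of bits (m)
    private final int hashCount;  // number of hash functions (k)

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Sets the k bit positions derived from the item.
    void put(String item) {
        long h = fnv1a64(item.getBytes(StandardCharsets.UTF_8));
        int h1 = (int) h;
        int h2 = (int) (h >>> 32) | 1; // forced odd so the step is never zero
        for (int i = 0; i < hashCount; i++) {
            bits.set(Math.floorMod(h1 + i * h2, size));
        }
    }

    // false means "definitely not present"; true means "possibly present".
    boolean mightContain(String item) {
        long h = fnv1a64(item.getBytes(StandardCharsets.UTF_8));
        int h1 = (int) h;
        int h2 = (int) (h >>> 32) | 1;
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, size))) {
                return false;
            }
        }
        return true;
    }

    // 64-bit FNV-1a hash; its two halves feed the double-hashing scheme above.
    private static long fnv1a64(byte[] data) {
        long hash = 0xcbf29ce484222325L;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;
        }
        return hash;
    }

    public static void main(String[] args) {
        SimpleBloomFilter malwareUrls = new SimpleBloomFilter(1 << 16, 5);
        malwareUrls.put("http://malware.example/a");   // made-up URL
        malwareUrls.put("http://malware.example/b");   // made-up URL
        // Inserted items are always reported as possibly present.
        System.out.println(malwareUrls.mightContain("http://malware.example/a"));
        // Items never inserted are reported absent unless a false positive occurs.
        System.out.println(malwareUrls.mightContain("http://safe.example/"));
    }
}
```

The filter never stores the items themselves, only bits, which is why it can answer "definitely not present" with certainty but "present" only with some probability.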


Selenium WebDriver – Installation & Configuration

By Sandhya

Selenium WebDriver is an elegant programming interface and a compact object-oriented API. It is an efficient tool for testing web applications in different browsers, irrespective of the programming language of the application. WebDriver makes direct calls to the browser using each browser’s native support.

I will discuss here the installation and configuration of WebDriver, illustrating the process with a simple example.

Step 1:

Check whether Java is installed on your machine [enter the command java -version at the command prompt]. If Java is not installed, download and install the Java JDK [download JDK].

 

fig1

Now set the Java path [How to set path].

Step 2:

Download the Eclipse IDE [Download Eclipse IDE]. Extract the downloaded zip file; no installation is required to use Eclipse.

fig2

Step 3:

Download the Selenium JavaClientDriver [JavaClientDriver] and extract it.

fig3 (more…)


Homage to Dr APJ Abdul Kalam

The Bimarian team salutes Dr APJ Abdul Kalam, the epitome of humanity. It is a befitting honor to the world that the United Nations declared Kalamji’s birthday, October 15, as World Students’ Day.

We would all love to see the speeches, quotes and books of this great teacher become part of the educational curriculum right from the primary school level. We trust that academicians around the world have already initiated this. We wish Indian educationists to be the harbingers.

We appeal to the Information Technology community to share the thoughts of this YOGI of humanity with the kids (their own children, nephews and nieces) as anecdotes from his books Wings of Fire, India 2020 and Ignited Minds, which are believed to be precursors of the thought of how important a livable planet Earth is.

We solemnly pray that Heaven enjoys the benefit of the presence of this great soul of wisdom that helps Creating a Livable Planet Earth.

Bimarian Team


Installing Cloud Foundry BOSH Command Line Interface (CLI) on Centos 7

By Harikrishna Doredla

Cloud Foundry BOSH is an open source tool. The BOSH Command Line Interface (CLI) is used to interact with the Director and to bootstrap new BOSH environments. The CLI is written in Ruby and is provided by two gems:

  • bosh_cli contains the main operator commands.
  • bosh_cli_plugin_micro contains the bootstrapping commands.

Install the two gems by following the steps below: (more…)


Building Fault-Tolerant Web Application on AWS

By Hari Doredla

Problem:

The Star Interactive platform is a web application that provides for the interactive exchange of messages between a celebrity and fans. Some changes were introduced into this game. The celebrities tweet messages that should reach their millions of fans across the globe, and the fans tweet messages back to the celebrity. As the fans started tweeting back, the load on the web server spiked sharply and the web server stopped responding to requests from the massive number of fans.

Solution:

We chose the AWS cloud as the best platform on which to build the web application, to get better performance with minimal cost and high availability (fault tolerance), rather than scaling up the web servers using Squid as a load balancer.

AWS: Amazon Web Services is a collection of remote computing services that make up a cloud computing platform, offered over the Internet by Amazon.com… (more…)


Install and Configure PostgreSQL on Centos

By Harikrishna Doredla

Here is how to install Postgres 9.3 on Centos 6.5:

  • Check Centos server version.
    [root@hadoop3 init.d]# cat /etc/redhat-release    (shows the server version)
    CentOS release 6.5 (Final)
    [root@hadoop3 init.d]#
  1. Edit the repo file – /etc/yum.repos.d/CentOS-Base.repo, [base] and [updates] sections
    In the base and updates sections append a line as below:
    exclude=postgresql*
  2. rpm -Uvh http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-centos93-9.3-1.noarch.rpm
  3. yum list postgres*
  4. yum install postgresql93-server postgresql93
  5. service postgresql-9.3 initdb
  6. service postgresql-9.3 status
  7. chkconfig --add postgresql-9.3
  8. chkconfig postgresql-9.3 on
  9. service postgresql-9.3 start
  10. su - postgres
  11. psql --version
    psql (PostgreSQL) 9.3.3
  12. Setting password for the first time login after installation:
    psql
    ALTER USER postgres with encrypted password 'postgres';
  13. Enabling remote access:

(more…)


Kerberos authentication with Windows Active Directory

By Harikrishna Doredla

Enable Kerberos authentication in Active Directory.

  • The domain controller is enabled for Kerberos service delegation by default.
  • Start -> Active Directory Users and Computers -> select your domain -> Domain Controllers -> right-click your default-first-site-name and click Properties; in the snap-in that opens, select the Delegation tab (please see the figure below).
  • Make sure “Trust this computer for delegation to any service (Kerberos only)” option is enabled. ..

(more…)


Testing web applications with Selenium IDE:

By Sandhya

Selenium is an open source automation tool for functional and regression testing of web applications. It allows users to test across multiple browsers and platforms. We can write Selenium scripts in many programming languages, such as Java, PHP, Python, Ruby, Perl and C#.

Selenium has a suite of tools:

  • Selenium Core
  • Selenium 1/Selenium RC
  • Selenium 2/Selenium WebDriver
  • Selenium Grid
  • Selenium IDE

I use Selenium IDE (Integrated Development Environment). It is a Mozilla Firefox plugin [add-on] and is easy to install. .. (more…)


Installing kerberized PostgreSQL (on Centos 6.5)

By Harikrishna Doredla

Kerberos is a network authentication protocol. It allows client/server communication over a non-secure network, using secret-key cryptography (“tickets”) so that the client and server can identify one another. Once the server and the client have mutually authenticated, the privacy and data integrity of communications between them are assured. There is no risk of passwords being captured from network traffic.

Here is how I installed this highly secure solution from MIT (Massachusetts Institute of Technology) on Centos 6.5 to achieve kerberized PostgreSQL. .. (more…)
