Vagrant Error: Failed to connect to atlas.hashicorp.com

In this article:

  • A solution to a very common error that occurs while downloading a VM box through Vagrant.

Problem:

While running a Vagrant command such as vagrant up or vagrant box add, the following error occurs:

C:\_vagrantwork\proj2>vagrant box add hashicorp/precise32
The box 'hashicorp/precise32' could not be found or could not be accessed in the remote catalog. If this is a private box on HashiCorp's Atlas, please verify you're logged in via `vagrant login`. Also, please double-check the name. The expanded URL and error message are shown below:

URL: "https://atlas.hashicorp.com/hashicorp/precise32"
Error: Failed to connect to atlas.hashicorp.com port 1080: Timed out

Background:

  • This error generally occurs on corporate networks due to proxy settings.

Solution:

     Step 1: Get the proxy

  • On Windows 7 or higher, execute netsh winhttp show proxy.
  • Look at the output under "Proxy Server(s)"; it should look like <proxy_string>:<proxy_port>.
  • Copy the string.

    Step 2: Get the IP

  • Ping <proxy_string>.
  • Copy the IP address from the ping output (let's call it proxy_ip).

    Step 3: Set the variables (you may set them under Environment Variables for a permanent fix)

  • set HTTPS_PROXY=https://<proxy_ip>:<proxy_port>
  • set HTTP_PROXY=http://<proxy_ip>:<proxy_port>

    Step 4: Run your command

  • Issue the Vagrant command again; the box download should now go through the proxy.

Happy Reading!!

Nirbhaya Bhava!!

BigData Keywords: A BigData User Dictionary of related-technology definitions and much more


In this article you will learn:

  • This article collects jargon from BigData-related technologies.
  • It gives you the best possible article links and books to understand the topics more deeply.
  • As BigData-related technologies are still evolving, this article will keep getting updated from time to time based on my personal experience, learning and findings.

Hadoop characteristics

  • Hadoop is "data-parallel", but "process-sequential". Within a job, parallelism happens within the map phase as well as the reduce phase, but these two phases cannot run in parallel: the reduce phase cannot start until the map phase is fully completed.
  • All data being accessed by the map process needs to be frozen (no updates can happen) until the whole job is completed. This means Hadoop processes data in chunks in a batch-oriented fashion, making it not very suitable for stream-based processing where data flows in continuously and immediate processing is needed.
  • Data communication happens via a distributed file system (HDFS). Latency is introduced because extensive network I/O is involved in moving data around (e.g. 3 copies of the data need to be written synchronously). This latency is not an issue for batch-oriented processing, where throughput is the primary factor, but it means Hadoop is not suitable for online access where low latency is critical.

Hadoop is NOT good at the following

  • Perform online data access where low latency is critical (Hadoop can be used together with HBase or another NoSQL store to deliver low-latency query responses)
  • Perform random, ad-hoc processing of a small subset of data within a large data set (Hadoop is designed to scan all data in parallel)
  • Process small data volumes (for data volumes below the hundreds-of-GB range, many more mature solutions exist)
  • Perform real-time, stream-based processing where data arrives continuously and immediate processing is needed (to keep the overhead small enough, data typically needs to be batched for at least 30 minutes, so you won't be able to see the current data until those 30 minutes have passed)

Ref: http://horicky.blogspot.com/2009/11/what-hadoop-is-good-at.html

How Hadoop works

  1. Data is broken into blocks of 64 or 128 MB.
  2. The blocks are distributed to the nodes.
  3. The Job Tracker starts the scheduler and tracks each node's output.
  4. When all nodes are done, the final output is generated. (A minimal MapReduce sketch follows below.)
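
To make the map/reduce flow concrete, below is a minimal word-count sketch written against the classic Hadoop MapReduce Java API. It is only an illustration: the class names are my own and the input/output paths are taken from the command line.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: runs in parallel on each block and emits (word, 1) pairs.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reduce phase: starts only after the map phase completes; sums the counts per word.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // e.g. an HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory must not already exist
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}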

Keywords in BigData Technologies

Data Locality: Move computation closer to the data to avoid network congestion.
GPU: Graphics Processing Unit.
Big Data 3V's: Volume, Velocity, Variety.

Volume: the sheer amount of data.

Velocity: the rate at which data grows/arrives.

Variety: the different kinds of data formats.

Ref: http://www.hadoopinrealworld.com/what-is-big-data/ [Read it, the example is awesome]

Big Data Problem:

  1. How to store and compute efficiently.
  2. Data analysis: how fast the analysis is done.
  3. The total cost of doing the above two steps.

RDBMS scalability issues:

  1. Big Data needs the data to be de-normalized and pre-aggregated for faster query execution times, which is the main issue with RDBMS.
  2. Indexes and query optimization have to be changed from time to time.
  3. No horizontal scalability, meaning you can't just add more hardware to bring computation time down; you are left with query tuning instead.
  4. RDBMS are built for structured data.

RDD Model:
  • Resilient Distributed Datasets, a concept introduced by Spark: an immutable, fault-tolerant, distributed collection of objects that can be operated on in parallel.
  • RDDs are collections of objects that are partitioned across the cluster and can be stored on the nodes.
  • They are built through graphs of parallel transformations, such as map-reduce and group-by, similar to the graphs used to compute results in Dryad. RDDs are automatically rebuilt on failure by the runtime system.
  • Spark offers this abstraction embedded in several programming languages, including Java, Scala and Python (a small Java example is shown after this keyword list).
Data Model: a way to store data in a database.
Replication: copying the same data from one node to another, for availability.
Majority read/write: a read or write that is considered successful only once a majority (quorum) of the replicas have acknowledged it.
Latency: the delay from an input into a system to the desired output.
Hadoop: a distributed system (with a master-slave configuration) to handle Big Data, which typically solves the following problems:

  • Data transportation
  • Scaling up and down
  • Handling partial failures of the application
Hadoop core components:
  1. HDFS [for storage]
  2. Map-Reduce [for processing]
Hadoop Cluster: a set of machines which run Hadoop's core components, HDFS and Map-Reduce.
Node:
  • A single machine in a Hadoop cluster.
  • Each node runs HDFS + Map-Reduce.
Name Node: the Hadoop node running HDFS on the master, i.e. the node which stores the filesystem metadata.
Data Node: a Hadoop node running HDFS on a slave; it stores the actual data blocks.
Job Tracker: the Hadoop node running Map-Reduce on the master.
Task Tracker: a Hadoop node running Map-Reduce on a slave.
Apache Spark: an open-source distributed computing engine/framework for data processing and analytics. It is part of the Hadoop ecosystem of technologies.

It supports a variety of data sources (Kafka, MongoDB, HDFS, Hive, etc.), environments (Spring, Docker, Hadoop, OpenStack, etc.) and applications (Mahout, Hive, Thunder, Sparkling, etc.).

Spark has several components: Core, SQL, Streaming, MLlib.

Spark Core is the base engine, which supports:

  • Memory management
  • Fault recovery
  • Task management (schedule, distribute, monitor)
  • Storage system interaction

Spark supports iterative, interactive and batch data processing.

Note: Hadoop MapReduce (written in Java) is limited to batch data processing. While Hadoop MapReduce stores intermediate data on disk, Spark keeps it in memory, hence Spark (written in Scala) is better suited to near-real-time data processing.

Apache Mahout: a machine learning library for Hadoop.
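
To make the RDD idea above concrete, here is a minimal, illustrative Spark job using the Java RDD API. It is a sketch only: the application name, the "local[*]" master (an in-process run) and the sample values are placeholders.

import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class RddSketch {
    public static void main(String[] args) {
        // "local[*]" runs Spark in-process; on a real cluster this would be the cluster master URL.
        SparkConf conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // An RDD: an immutable, partitioned collection distributed across the cluster.
        List<Integer> values = Arrays.asList(8, 3, 5, 2);
        JavaRDD<Integer> numbers = sc.parallelize(values);

        // Transformations (map) are lazy; the action (reduce) triggers the actual computation.
        int sumOfSquares = numbers.map(x -> x * x).reduce(Integer::sum);
        System.out.println("Sum of squares: " + sumOfSquares);

        sc.close();
    }
}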

 

What Next?
This space will keep getting updated over time. Keep an eye on it, and I promise to share the best BigData-technology-related information.

Happy Learning!!

Nirbhaya Bhava!!

How to generate OVF file and import to vSphere client.

Working with Virtual Machines (VM) – VMware workstation and vSphere client

Part 1: How to generate OVF file and import to vSphere client.

In this article you will learn:

  • This will be a series of articles covering different areas of working with VMs.
  • This particular article helps you generate a VM when you are migrating from VMware Workstation to the vSphere client.
  • It shows how to generate an OVF file.
  • And how to deploy that OVF file through the vSphere client.

Problem Statement:

You have an ISO file and a VMDK file, and want to create an OVF file.

The Background:

Nowadays, people prefer the vSphere client over VMware Workstation. VMware Workstation uses an ISO and/or VMX file to create a VM, but the vSphere client needs an OVF file, which can be deployed to create the VM.

Explore this:

Phase 1: Create the OVF file

  1. Open VMware Workstation.
  2. Select the VM for which the OVF file needs to be generated.
  3. Go to the "File" menu of VMware Workstation and choose the option "Export to OVF".


  4. Choose the target directory for the export and wait for completion.

Phase 2: Import the OVF file

  1. Log in to the vSphere client.
  2. Click the "File" menu and select the option "Deploy OVF template".


  3. Browse to the location of the exported OVF and click OK.

  4. Wait a while for the deployment to complete.

Print this article:

Working with Virtual Machines

What next:

In the next article I will share how to increase the disk size of a VM. Till then, happy reading.

Nirbhaya Bhava!!

Create User Defined Linked List with Sorting Facility

In this article you will learn:

  • Create a user-defined linked list.
  • Use of generics, enums and Comparable.
  • Basic design principles.


Problem Statement:

Create a custom linked list (i.e. don't use Java's default LinkedList) with a sorting facility. Sorting facility means the data should be inserted in a specified order, either ascending or descending.

The Background:

Nowadays, one of the most popular interview questions is to create your own linked list, with the data inserted in some specified order.

Techniques used to impress the interviewer:

  • Followed the code-to-interface design principle, to create a generalized API.
  • Use of generics to make the code independent of a specific data type.
  • Use of enums to provide type safety.
  • Use of Comparable, to provide the sorting facility.

Explore this:

Let's start with the basics. We will create the following classes in the given order:

  1. Concrete class: Node – the most basic part of the linked list, which holds the data and a reference to the next node.
  2. Enum: SortOrder – this enum contains the sort orders we support.
  3. Interface: CustomLinkedList – the template declaring the methods of our linked list.
  4. Concrete class: CustomLinkedListimpl – this holds the logic to build our linked list using the Node class.
  5. Concrete class: CustomLLClient – the client class that exercises the created linked list.

To learn the basics of linked lists, please follow this.

The code is as follows:

 

package customlinkedlist;

// Node: the basic building block of the linked list; holds the data and a reference to the next node.
public class Node<T> {

    private Comparable<T> data;
    private Node<T> next;

    public Comparable<T> getData() {
        return data;
    }

    public void setData(Comparable<T> data) {
        this.data = data;
    }

    public Node<T> getNext() {
        return next;
    }

    public void setNext(Node<T> next) {
        this.next = next;
    }
}

package customlinkedlist;

// SortOrder: the supported insertion orders.
public enum SortOrder {
    ASC,
    DSC
}

package customlinkedlist;

// CustomLinkedList: the interface (template) our linked list implementation codes to.
public interface CustomLinkedList<T> {

    Node<T> insert(Comparable<T> data);
}

package customlinkedlist;

// CustomLinkedListimpl: builds the list using Node, keeping elements in the configured sort order.
public class CustomLinkedListimpl<T> implements CustomLinkedList<T> {

    private Node<T> head;
    private SortOrder sortOrder;

    public CustomLinkedListimpl(SortOrder sortOrder) {
        // A sentinel head node simplifies insertion at the front of the list.
        head = new Node<T>();
        head.setNext(null);
        head.setData(null);
        this.setSortOrder(sortOrder);
    }

    @SuppressWarnings("unchecked")
    public Node<T> insert(Comparable<T> data) {

        Node<T> nodeToInsert = new Node<T>();

        if (head.getNext() == null) {
            // First element: simply hang it off the sentinel head.
            nodeToInsert.setData(data);
            nodeToInsert.setNext(null);
            head.setNext(nodeToInsert);
        } else {
            // Walk the list until the sort order tells us to stop, then splice the node in.
            Node<T> tempNode = head;
            while (tempNode.getNext() != null) {
                if (SortOrder.ASC.equals(this.getSortOrder())
                        && data.compareTo((T) tempNode.getNext().getData()) < 0) {
                    break;
                } else if (SortOrder.DSC.equals(this.getSortOrder())
                        && data.compareTo((T) tempNode.getNext().getData()) > 0) {
                    break;
                }
                tempNode = tempNode.getNext();
            }

            nodeToInsert.setData(data);
            nodeToInsert.setNext(tempNode.getNext());
            tempNode.setNext(nodeToInsert);
        }
        return nodeToInsert;
    }

    public void printList() {

        if (head == null || head.getNext() == null) {
            System.out.println("List is empty.");
            return;
        }

        Node<T> tempNode = head.getNext();
        System.out.print(tempNode.getData());

        while (tempNode.getNext() != null) {
            tempNode = tempNode.getNext();
            System.out.print(" --> " + tempNode.getData());
        }
    }

    public SortOrder getSortOrder() {
        return sortOrder;
    }

    public void setSortOrder(SortOrder sortOrder) {
        this.sortOrder = sortOrder;
    }
}

package customlinkedlist;

// CustomLLClient: a small client that builds ascending and descending lists and prints them.
public class CustomLLClient {

    public static void main(String[] args) {

        System.out.println("Integer LinkedList");
        CustomLinkedListimpl<Integer> intCustomLinkedList = new CustomLinkedListimpl<Integer>(SortOrder.ASC);
        intCustomLinkedList.insert(8);
        intCustomLinkedList.insert(3);
        intCustomLinkedList.insert(5);
        intCustomLinkedList.insert(2);
        intCustomLinkedList.printList();

        System.out.println("\n\nString LinkedList");
        CustomLinkedListimpl<String> customLinkedList = new CustomLinkedListimpl<String>(SortOrder.DSC);
        customLinkedList.insert("Tango");
        customLinkedList.insert("Alpha");
        customLinkedList.insert("Ram");
        customLinkedList.insert("Romeo");
        customLinkedList.printList();
    }
}

The result is:

Integer LinkedList
2 --> 3 --> 5 --> 8

String LinkedList
Tango --> Romeo --> Ram --> Alpha

What Next:

Suggestions are most welcome. Let me know if you have any doubts as well.

Print this Article:

A printer-friendly version of this document is available: Create custom linked list

Happy Reading.

Nirbhaya Bhava!!

Tanzeem.

JVM, JRE, JDK – What's the relation?


In this article:

  • Clear up the common confusion among JVM, JRE and JDK.
  • This will help you understand how a Java class is processed.

The background:

A Java programmer writes a .java file, and it gets executed on any environment. Have you ever thought how? Once the programmer is done writing the .java file, his/her job is done; he/she has nothing to worry about regarding what is needed to run the Java program. But hold on, this is not possible on a plain computer. Then how? Time to explore this.

Explore this:

That computer must have a JDK (Java Development Kit), which provides all the necessary tools and libraries to write and execute Java programs. Inside the JDK there is the JVM (Java Virtual Machine), which reads .class files (the .java files converted by the compiler). The JVM just understands how to read .class files; it takes help from the JRE (Java Runtime Environment) to execute the .class file. The following picture depicts this:


If you understand set theory, then put this way: the JDK is a superset of the JRE, and the JRE is a superset of the JVM.
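
A quick way to see which JDK/JRE/JVM a class actually runs on is to print a few standard system properties, as in the small sketch below (these property keys are standard on HotSpot/OpenJDK, though java.runtime.name may be absent on other JVMs):

// Prints which Java runtime and virtual machine are executing this class.
public class RuntimeInfo {
    public static void main(String[] args) {
        System.out.println("Java version : " + System.getProperty("java.version"));
        System.out.println("Runtime (JRE): " + System.getProperty("java.runtime.name"));
        System.out.println("VM (JVM)     : " + System.getProperty("java.vm.name"));
        System.out.println("Installed at : " + System.getProperty("java.home"));
    }
}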

Extra:

One of the most important interview questions is: why is the JVM called "virtual"? The answer: the JDK is a software bundle which contains the JVM too; since the JVM exists only as software and has no physical existence as a machine, it is called virtual.

What Next:

Go deeper into JVM internals, class loading and garbage collection. This will give you a solid base for writing efficient programs. Below are some links and books which will surely help you. All the very best.

  • JVM internals-

JVM architecture

JVM in concise way

  • Book on Garbage collection and more

Java Performance By Charlie Hunt, Binu John

Till then happy reading,

Nirbhaya Bhava!!

Problem Solving Approach: Production and Non Prod behavior is different

In this article:

  • You will able to answer how to debug issue when production code behavior is not as expected as in non-production.
  • You will able to learn some basic causes due to that same code base behave differently in production and non-production.
  • You will get idea how to approach for this kind of scenario.
  • This will help to resolve production issues in quick time to avoid SLA.


Scenario:

Some functionality works as expected in the non-production environment, but does not work in production.

The background:

Sometimes we run into situations where the production behavior of some functionality (which was part of the current release) is not as expected. QA has already given sign-off, UAT has performed their testing, and the RTP team followed the migration document properly. But still the functionality does not work. What went wrong? In production we can't just enable debug logging to check everything. Then how do we resolve the issue? How do we make the production code work as expected? The following steps may be the solution; they are based purely on real-world experience.

Explore this:

Below are the sequential steps we usually follow. At any point, when a gap is found, stop the analysis, correct the migration document, then follow the remaining steps to make sure there are no more gaps, and go for the release.

1. Verify the WAR file version

  • Check the WAR file version in production against the non-prod environment where sign-off was given.
  • Check the code base. If it is J2EE, verify that the Java classes are up to date in the version control system.
  • Check that the above changes are actually in the WAR file.

2. Missing property file/database changes

  • Based on the project's management process, check the tracking document which contains all the changes required to implement the particular feature.
  • It may be the design document or some kind of solution-approach document.

3. Enable the same log level in non-prod

  • Based on observation, developers sometimes put part of the business logic inside debug/info-level logging code by mistake. That can also cause the situation where the same code base works fine in non-prod but not in production (see the sketch after this list).
  • To identify this, set the logger level to the same level as in production.
  • Usually the Error level is enabled in production, while in non-prod it is Debug.
  • Read this article for more info.
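
As an illustration of that pitfall, here is a minimal sketch using the SLF4J logging API (the class, field and method names are made up): the counter is updated only while the debug guard is true, so the behavior silently changes when production runs at the Error level.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogLevelPitfall {

    private static final Logger LOG = LoggerFactory.getLogger(LogLevelPitfall.class);

    private int processedCount = 0;

    public void process(String orderId) {
        // BUG: the business side effect sits inside the debug guard, so it runs only
        // when the DEBUG level is enabled (typical in non-prod) and is silently
        // skipped in production, where only ERROR is enabled.
        if (LOG.isDebugEnabled()) {
            processedCount++;                                    // side effect in the wrong place
            LOG.debug("Processed order {} (total {})", orderId, processedCount);
        }

        // FIX: do the work unconditionally; guard only the logging.
        // processedCount++;
        // LOG.debug("Processed order {} (total {})", orderId, processedCount);
    }

    public int getProcessedCount() {
        return processedCount;
    }
}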

4. Hit the node URL

  • The production code base is usually deployed on more than one server or node (possibly the same server with different JVMs). So rather than hitting the public URL (like google.com), hit the node-specific URL (like nodeid:8080).
  • If any node URL gives the desired result, that means there is a problem with that server and/or the deployment process. Maybe the temp directory did not get cleared (in the case of a WebLogic server); clear the temp directory and restart the application.
  • Check whether the startup parameters are in sync across nodes.

5. Stop external scripts on the specific page

  • To track user experience and/or to change the site's look and feel through campaigns, companies use external scripts such as Adobe Test&Target.
  • These scripts run on top of the core data returned by our application, hence they can also be responsible for the different behavior of the production code base.
  • To confirm whether the external scripts are causing the problem, just stop them on the specific pages where the desired functionality is not working in production.

What Next:

  • Just apply the above tips; they are based on personal experience. Maybe you have come across a different scenario; if so, please share it.
  • In the next article I plan to share real-time tools for resolving production issues.

Till then Happy Reading,

Nirbhaya Bhava!!

Error when Starting 64-bit Eclipse on 64-bit Windows 7

Problem: An error occurs while starting 64-bit Eclipse on 64-bit Windows.

Errors:

  1. Failed to load the JNI shared library "C:/{some library}/bin/client/jvm.dll".
  2. Java was started but returned exit code 13.


Quick Solution:

  1. Download the correct JDK and Eclipse for 64-bit Windows.
  2. Make sure both installations go into "Program Files".
  3. Correct the PATH variable for Java.
  4. Update eclipse.ini with the following value:

    -vm
    {path_to_64bit_java}\bin\javaw.exe

Explanation and Points to Remember:

  1. Eclipse looks for a working and compatible Java, which is specified in step 4 above.
  2. Remember, irrespective of the OS bitness, Eclipse and the JDK must be of the same bitness. Obviously the OS bitness should be higher than or equal to that of the JDK and Eclipse. For example, you cannot set up 64-bit Eclipse on a 64-bit OS using a 32-bit JDK. (A small check is sketched after this list.)
  3. Download JDK 6 (or whichever JDK version you want to run) from the JDK download page. Select the topmost hyperlink; at the time of writing it is Java SE Development Kit 6u45. Accept the license agreement and click to download the 64-bit JDK.
  4. As far as the folder naming convention is concerned, x86 is the representation for 32-bit Windows.
  5. While installing the JDK, make sure the default installation directory is "Program Files" rather than "Program Files (x86)". If it is not, you have downloaded a 32-bit JDK.
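
To check which bitness a given JDK actually is, you can compile and run a tiny class like the sketch below with that JDK. The sun.arch.data.model property is a HotSpot convention and may be missing on other JVMs, so os.arch is printed as a fallback.

// Prints the data model (32/64 bit) and architecture of the JVM running this class.
public class JvmBitness {
    public static void main(String[] args) {
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        String osArch = System.getProperty("os.arch");
        System.out.println("JVM data model : " + dataModel + " bit");
        System.out.println("JVM os.arch    : " + osArch);
    }
}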

Hope this solves the problem. If not, just let me know your problem through the comments or by mail.

Happy Reading.

Nirbhaya Bhava!!
