Graphs

April 29, 2016May 1, 2016 / Chu Rong / Leave a comment

Graph’s Representation:

To solve graph problems, we should firstly understand how to store. We also need to figure out that in what data structure we store the graph will help us solve the problem. The first step to solve graph problem is to properly store the graph in a correct data structure. Generally, there are four ways to represent graphs. Each with its advantages and disadvantages.

Nodes as objects and edges as pointers.

Most problems about (binary) trees usually use this pattern.

Time complexity: O(V)

Memory complexity: O(V)

A list of edges represented by two nodes.

For Example,

[ [0,1], [0,6], [0,8], [1,4], [1,6], [1,9], [2,4], [2,6], [3,4], [3,5],
[3,8], [4,5], [4,9], [7,8], [7,9] ]

Graph problems in general are likely use this pattern. Sometimes, we would like to convert it to an adjacent list.

Time complexity: O(E)

Memory complexity: O(E)

Adjacency lists

Representing a graph with adjacency lists combines adjacency matrices with edge lists.

Time complexity:

Memory complexity:

Adjacency Matrices

from node in column to node in row, if there is an edge, represent by 1 or not by 0.

adjacency matrices

Time complexity:

Memory complexity:

Reference

[1] KHANACADEMY

Why string concatenation is more expensive than using stringBuilder?

April 16, 2016April 16, 2016 / Chu Rong / Leave a comment

When we use string concatenation to chain other objects, we actually create a new object of string each time we concatenate a new object. Which increase the memory consumption. On the other hand, stringBuilder solves this problem, it append new characters to one object each time. We can get the string object when the task is finished.

Hash Table

April 14, 2016April 19, 2016 / Chu Rong / Leave a comment

Learn from these links:

http://stackoverflow.com/questions/730620/how-does-a-hash-table-work

https://en.wikipedia.org/wiki/Hash_table

http://www.cs.cornell.edu/courses/cs3110/2011sp/lectures/lec20-amortized/amortized.htm

What is hash algorithm (hash function)?

A hash function takes a group of characters (called a key) and maps it to a value of a certain length (called a hash value or hash). The hash value is representative of the original string of characters, but is normally smaller than the original.

Hashing is done for indexing and locating items in databases because it is easier to find the shorter hash value than the longer string. Hashing is also used in encryption.

This term is also known as a hashing algorithm or message digest function.

Hashing is used with a database to enable items to be retrieved more quickly. Hashing can also be used in the encryption and decryption of digital signatures. The hash function transforms the digital signature, then both the hash value and signature are sent to the receiver. The receiver uses the same hash function to generate the hash value and then compares it to that received with the message. If the hash values are the same, it is likely that the message was transmitted without errors.

One example of a hash function is called folding. This takes an original value, divides it into several parts, then adds the parts and uses the last four remaining digits as the hashed value or key.

Another example is called digit rearrangement. This takes the digits in certain positions of the original value, such as the third and sixth numbers, and reverses their order. It then uses the number left over as the hashed value.

It is nearly impossible to determine the original number based on a hashed value, unless the algorithm that was used is known.

How is a hash table stored in memory?

A hash table is stored in the memory as an array of buckets. Each bucket has a unique hashcode. each entry (key value pair) will store in a bucket based on the hash value (equal to the hashcode of the target bucket) calculated based on the key. Shows in the picture below.

630px-Hash_table_3_1_1_0_1_0_0_SP

Why there is a collision in hash table?

Because the hash value is representative of the original string of characters, but the range of hash values is normally smaller than the original. values after hashing function may result in a same value. Then a collision happens.

How to deal with collision?

When a collision happens in a bucket, we can chain the entries as a linked list, as showed in the picture above. There are also many other data structures that can be used to chain entries in a bucket. For example, we can use balanced binary tree. In this case, the time complexity will be improved, but the space complexity will be increased.

The size of the bucket in a hash table is pre-defined, which decides how many entries that a bucket can hold. if the array becomes too full, we will just resizing the array. There many resizing ways. One is to double the size of array. recompute the hashcodes for existing keys (due to the size of array is changed), copy them to the new array, delete the old one.

This operation is not constant time, but rather linear in the number of elements at the time the table is grown.

What is the Time and Space complexity for using hash table?

As for linked list chaining method.

Time complexity:

average is O(1).

Because in most cases, the collision would not happen frequently, we can get the value for get the hascode of its key in constant time. We say that the insertion operation has O(1) amortized run time because the time required to insert an element is O(1) on average, even though some elements trigger a lengthy rehashing of all the elements of the hash table.

worst would be O(n).

Because, in worst case, all entries are stored in a bucket which means every time a new entry create will have a collision happen. In this case, to access or search a entry in hash table would be the same as linked list.

Space complexity:

O(n).

Because the size of space will grow linearly as the number of entries increasing.

Array

April 14, 2016 / Chu Rong / Leave a comment

Learn from these links:

https://en.wikipedia.org/wiki/Array_data_structure

What is Array?

An array is a data structure consisting of a collection of elements, each identified by one index or key.

In most cases, the indexes of an array is a sequence of continuous integer numbers, starting from 0 or 1. As what we’ve seen in most programming languages like, C, JAVA, Python, JavaScript, etc.

How are arrays stored in Memory?

Basically, An array is stored in a sequence of continuous addresses in memory. How much space each element occupies depends on what type of the element is in the array.

What is time and space complexity of using arrays?

Space complexity: O(n);

As arrays are stored in a continuous spaces, and memory is just a series of continuous spaces. We just need k*n space to store an array, n is the length of an array, k is the size of each element of the array.

Time complexity:

Access: O(1)

Because array is stored in a continuous space address. For finding any element in an defined array, we just need to calculate the address of the wanted element based on the starting address and the index of the element. Thus the time for getting an element in an array is constant.

Search: O(n)

To search a value in an array, we need to traversal the whole array.

Keep Learning

Data Structure

Graphs

Graph’s Representation:

Reference

Why string concatenation is more expensive than using stringBuilder?

Hash Table

What is hash algorithm (hash function)?

How is a hash table stored in memory?

Why there is a collision in hash table?

How to deal with collision?

What is the Time and Space complexity for using hash table?

Array

What is Array?

How are arrays stored in Memory?

What is time and space complexity of using arrays?

	Distributed System S… on Hadoop
	Array \| Pure technic… on Memory basics