首页 > 技术文章 > HashMap的put和get原理,结合源码分析详细分析

houstao 2018-01-12 00:13 原文

HashMap(java7)

   public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {} 

  HashMap的数据结构是数组+链表,从上面的源码可以看出来,hashMap继承了AbstractMap<K,V>的抽象类,实现了Map<K,V>的接口。

一、put方法:

  Java源码:

 1 /**
 2  * Hash table based implementation of the <tt>Map</tt> interface.  This
 3  * implementation provides all of the optional map operations, and permits
 4  * <tt>null</tt> values and the <tt>null</tt> key.  (The <tt>HashMap</tt>
 5  * class is roughly equivalent to <tt>Hashtable</tt>, except that it is
 6  * unsynchronized and permits nulls.)  This class makes no guarantees as to
 7  * the order of the map; in particular, it does not guarantee that the order
 8  * will remain constant over time.
 9  *
10  * <p>This implementation provides constant-time performance for the basic
11  * operations (<tt>get</tt> and <tt>put</tt>), assuming the hash function
12  * disperses the elements properly among the buckets.  Iteration over
13  * collection views requires time proportional to the "capacity" of the
14  * <tt>HashMap</tt> instance (the number of buckets) plus its size (the number
15  * of key-value mappings).  Thus, it's very important not to set the initial
16  * capacity too high (or the load factor too low) if iteration performance is
17  * important.
18  *
19  * <p>An instance of <tt>HashMap</tt> has two parameters that affect its
20  * performance: <i>initial capacity</i> and <i>load factor</i>.  The
21  * <i>capacity</i> is the number of buckets in the hash table, and the initial
22  * capacity is simply the capacity at the time the hash table is created.  The
23  * <i>load factor</i> is a measure of how full the hash table is allowed to
24  * get before its capacity is automatically increased.  When the number of
25  * entries in the hash table exceeds the product of the load factor and the
26  * current capacity, the hash table is <i>rehashed</i> (that is, internal data
27  * structures are rebuilt) so that the hash table has approximately twice the
28  * number of buckets.
29  *
30  * <p>As a general rule, the default load factor (.75) offers a good tradeoff
31  * between time and space costs.  Higher values decrease the space overhead
32  * but increase the lookup cost (reflected in most of the operations of the
33  * <tt>HashMap</tt> class, including <tt>get</tt> and <tt>put</tt>).  The
34  * expected number of entries in the map and its load factor should be taken
35  * into account when setting its initial capacity, so as to minimize the
36  * number of rehash operations.  If the initial capacity is greater
37  * than the maximum number of entries divided by the load factor, no
38  * rehash operations will ever occur.
39  *
40  * <p>If many mappings are to be stored in a <tt>HashMap</tt> instance,
41  * creating it with a sufficiently large capacity will allow the mappings to
42  * be stored more efficiently than letting it perform automatic rehashing as
43  * needed to grow the table.
44  *
45  * <p><strong>Note that this implementation is not synchronized.</strong>
46  * If multiple threads access a hash map concurrently, and at least one of
47  * the threads modifies the map structurally, it <i>must</i> be
48  * synchronized externally.  (A structural modification is any operation
49  * that adds or deletes one or more mappings; merely changing the value
50  * associated with a key that an instance already contains is not a
51  * structural modification.)  This is typically accomplished by
52  * synchronizing on some object that naturally encapsulates the map.
53  */

  然后我们再看一下,HashMap的组成,主要的是实现方法为put和get方法,主要的参数有capacity(桶的容量)和load factor(加载因子)。上面是官网的描述,大致意思为:

  1、HashMap实现了Map的接口,<K,V>对中的key和value都可以为空,除了不是线程安全的和允许为null外,几乎是与HashTable一样的。同时也不能保证其的顺序。

  2、一个hashmap对象主要有两个参数capacity(桶的容量)和load factor(加载因子),默认load factor为0.75是在时间和空间上性能最优的。

1  /**
2      * Constructs an empty <tt>HashMap</tt> with the default initial capacity
3      * (16) and the default load factor (0.75).
4      */
5     public HashMap() {
6         this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
7     }

  上面的是不带参数的HashMap的构造器,也是我们创建对象的时候经常使用的,默认数组或者桶的容量为16,加载因子为0.75。

 1    /**
 2      * Constructs an empty <tt>HashMap</tt> with the specified initial
 3      * capacity and the default load factor (0.75).
 4      *
 5      * @param  initialCapacity the initial capacity.
 6      * @throws IllegalArgumentException if the initial capacity is negative.
 7      */
 8     public HashMap(int initialCapacity) {
 9         this(initialCapacity, DEFAULT_LOAD_FACTOR);
10     }

  我们也可以自定义数组的容量,加载因子默认为0.75。

/**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        this.loadFactor = loadFactor;
        threshold = initialCapacity;
        init();
    }

  同时也可以既修改容量有修改加载因子,但是最好不要修改。

 1 /**
 2      * Associates the specified value with the specified key in this map.
 3      * If the map previously contained a mapping for the key, the old
 4      * value is replaced.
 5      *
 6      * @param key key with which the specified value is to be associated
 7      * @param value value to be associated with the specified key
 8      * @return the previous value associated with <tt>key</tt>, or
 9      *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
10      *         (A <tt>null</tt> return can also indicate that the map
11      *         previously associated <tt>null</tt> with <tt>key</tt>.)
12      */
13     public V put(K key, V value) {
14         if (table == EMPTY_TABLE) {
15             inflateTable(threshold);
16         }
17         if (key == null)
18             return putForNullKey(value);
19         int hash = hash(key);
20         int i = indexFor(hash, table.length);
21         for (Entry<K,V> e = table[i]; e != null; e = e.next) {
22             Object k;
23             if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
24                 V oldValue = e.value;
25                 e.value = value;
26                 e.recordAccess(this);
27                 return oldValue;
28             }
29         }
30 
31         modCount++;//
32         addEntry(hash, key, value, i);
33         return null;
34     }
  
 1  /**
 2      * Offloaded version of get() to look up null keys.  Null keys map
 3      * to index 0.  This null case is split out into separate methods
 4      * for the sake of performance in the two most commonly used
 5      * operations (get and put), but incorporated with conditionals in
 6      * others.
 7      */
 8     private V getForNullKey() {
 9         if (size == 0) {
10             return null;
11         }
12         for (Entry<K,V> e = table[0]; e != null; e = e.next) {
13             if (e.key == null)
14                 return e.value;
15         }
16         return null;
17     }
1     /**
2      * Returns index for hash code h.返回hashCode值
3      */
4     static int indexFor(int h, int length) {
5         // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
6         return h & (length-1);
7

解析:首先判断table,也就是数组是否为空,为空的话就去使用inflateTable的方法(这里不多解释)初始化hashmap。

   如果table不为空的话,就判断key是否为空,为空的话就将放到数组的index=0的位置,如果value不为空则返回value值。

   如果key不为空的话,就通过key获取hash值,通过hash值和table的长度与运算获取hashCode值。

   通过hashCode的遍历entry<K,V>的键值对,如果key的hash值相等 并且key.equals(e.key)也相等的话,就将新的value替换掉旧的,返回旧值。

   在put的过程中modCount记录修改的次数,用于fail-fast容错。

  那么会有人问了,为什么table.length的默认长度为16,而不是15或者14呢?

  首先,通过计算你就会发现hash取模使用&比使用%,得到的hash会更均匀,并且性能更好。例如:table.length=15

  h&(length-1)                                            hash                                                     length - 1                        结果

  8 & (15-1)                                                 1000                                                        1110                                1000

  9 & (15-1)                                                 1001                                                         1110                                1000

  -------------------------------------------------------------------------------------------------------------------------------------------------------

  8 & (16-1)                                                 1000                                                         1111                                  1000  

  9 & (16-1)                                                  1001                                                         1111                                  1001

  从上面的结果你可以发现,当table.length=15的时候,会出现hash碰撞的现象,并且所有结果为最后一位1的永远都不会出现,既0001,1001,1011,0011

  0101,1111这几个位置都不会存放数据,会出现存放不均匀,并且浪费资源的现象。如果将8和9放到同一个位置当我们get的时候,要遍历链表,降低了查          询效率。

 1     /**
 2      * Adds a new entry with the specified key, value and hash code to
 3      * the specified bucket.  It is the responsibility of this
 4      * method to resize the table if appropriate.
 5      *
 6      * Subclass overrides this to alter the behavior of put method.
 7      */
 8     void addEntry(int hash, K key, V value, int bucketIndex) {
 9         if ((size >= threshold) && (null != table[bucketIndex])) {
10             resize(2 * table.length);
11             hash = (null != key) ? hash(key) : 0;
12             bucketIndex = indexFor(hash, table.length);
13         }
14 
15         createEntry(hash, key, value, bucketIndex);
16     }
17 
18     /**
19      * Like addEntry except that this version is used when creating entries
20      * as part of Map construction or "pseudo-construction" (cloning,
21      * deserialization).  This version needn't worry about resizing the table.
22      *
23      * Subclass overrides this to alter the behavior of HashMap(Map),
24      * clone, and readObject.
25      */
26     void createEntry(int hash, K key, V value, int bucketIndex) {
27         Entry<K,V> e = table[bucketIndex];
28         table[bucketIndex] = new Entry<>(hash, key, value, e);
29         size++;
30     }

  当我们在addEntry的时候,新的添加的entry都会放到第一个位置,并且指向下一个entry,每一个entry中存放一个key-value和next引用。

二、get方法:

  Java源码:

 1  /**
 2      * Returns the value to which the specified key is mapped,
 3      * or {@code null} if this map contains no mapping for the key.
 4      *
 5      * <p>More formally, if this map contains a mapping from a key
 6      * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
 7      * key.equals(k))}, then this method returns {@code v}; otherwise
 8      * it returns {@code null}.  (There can be at most one such mapping.)
 9      *
10      * <p>A return value of {@code null} does not <i>necessarily</i>
11      * indicate that the map contains no mapping for the key; it's also
12      * possible that the map explicitly maps the key to {@code null}.
13      * The {@link #containsKey containsKey} operation may be used to
14      * distinguish these two cases.
15      *
16      * @see #put(Object, Object)
17      */
18     public V get(Object key) {
19         if (key == null)
20             return getForNullKey();
21         Entry<K,V> entry = getEntry(key);
22 
23         return null == entry ? null : entry.getValue();
24     }
25 
26     /**
27      * Offloaded version of get() to look up null keys.  Null keys map
28      * to index 0.  This null case is split out into separate methods
29      * for the sake of performance in the two most commonly used
30      * operations (get and put), but incorporated with conditionals in
31      * others.
32      */
33     private V getForNullKey() {
34         if (size == 0) {
35             return null;
36         }
37         for (Entry<K,V> e = table[0]; e != null; e = e.next) {
38             if (e.key == null)
39                 return e.value;
40         }
41         return null;
42     }
 1    /**
 2      * Returns the entry associated with the specified key in the
 3      * HashMap.  Returns null if the HashMap contains no mapping
 4      * for the key.
 5      */
 6     final Entry<K,V> getEntry(Object key) {
 7         if (size == 0) {
 8             return null;
 9         }
10 
11         int hash = (key == null) ? 0 : hash(key);
12         for (Entry<K,V> e = table[indexFor(hash, table.length)];
13              e != null;
14              e = e.next) {
15             Object k;
16             if (e.hash == hash &&
17                 ((k = e.key) == key || (key != null && key.equals(k))))
18                 return e;
19         }
20         return null;
21     }

解析:

  (1)get的时候,如果key==null,判断map的长度也为空的话,则返回null,如果map长度不为空,则e也不空,遍历table[0],返回e.value

  (2)getEntry的时候,首先要获取hash(key)的值,通过hash&table.length获取到的hashCode值得到entry在桶中存放的位置,判断如果如果传入的key的hash与要获得key的hash相等的话并且key.equals(e.key)y也相等,则返回entry,如果返回的jentry不为空的话,则getValue的值。

三、map中的元素不重复,元素无序

  测试:

1  Map map = new HashMap();
2         map.put("123","456");
3         map.put("abc","456");
4         map.put("dd","456");
5         map.put("ss","789");
6         Object s = map.put("123","777");
7         System.out.println(map);//key不可重复,重复的话会将原来的value值覆盖,但是value的值不是唯一的,每个key都有唯一的一个value与之对应
8         System.out.println(s);//返回旧的value值456

如有错误,请多多指出,谢谢!

  

 

推荐阅读