首页 > 解决方案 > 不重复字符的最长子串

问题描述

我一直在最长的子字符串上花费数小时而不重复字符 - LeetCode

  1. 不重复字符的最长子串

中等的

给定一个字符串,找出最长的不重复字符的子串的长度。

示例 1:

Input: "abcabcbb"
Output: 3 
Explanation: The answer is "abc", with the length of 3. 

示例 2:

Input: "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

示例 3:

Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3. 
             Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

可以使用两个指针混合 kadane 的算法来操作子数组来解决该问题

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        logging.debug(f"{list(enumerate(s))}")
        slo = fas = 0  #slow as the fisrt character in a subarray which not duplicate, fast as the fast.
                                  #relation: length = fas - slo
        current = set()
        glo = loc = 0
        while fas < len(s):
            logging.debug(f"pre_current: {current}, slow: {slo}, fast: {fas}")
            if s[fas] not in current: 
                current.add(s[fas]
                loc = fas - slo
                glo = max(glo, loc)
                 fas +=1
            else:
                current.remove(s[slo])
                slo += 1
            logging.debug(f"post_current: {current}, slow: {slo}, fast: {fas} \n")
        return glo

测试用例

    def test_g(self):
        s = "abccefg"
        answer = 4
        check = self.solution.lengthOfLongestSubstring(s)
        self.assertEqual(answer, check)

解决方案很清楚慢速和快速交替移动

$ python 3.LongestSubstring.py MyCase.test_g
DEBUG [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'c'), (4, 'e'), (5, 'f'), (6, 'g')]
DEBUG pre_current: set(), slow: 0, fast: 0
DEBUG post_current: {'a'}, slow: 0, fast: 1 

DEBUG pre_current: {'a'}, slow: 0, fast: 1
DEBUG post_current: {'b', 'a'}, slow: 0, fast: 2 

DEBUG pre_current: {'b', 'a'}, slow: 0, fast: 2
DEBUG post_current: {'b', 'c', 'a'}, slow: 0, fast: 3 

DEBUG pre_current: {'b', 'c', 'a'}, slow: 0, fast: 3
DEBUG post_current: {'b', 'c'}, slow: 1, fast: 3 

DEBUG pre_current: {'b', 'c'}, slow: 1, fast: 3
DEBUG post_current: {'c'}, slow: 2, fast: 3 

DEBUG pre_current: {'c'}, slow: 2, fast: 3
DEBUG post_current: set(), slow: 3, fast: 3 

DEBUG pre_current: set(), slow: 3, fast: 3
DEBUG post_current: {'c'}, slow: 3, fast: 4 

DEBUG pre_current: {'c'}, slow: 3, fast: 4
DEBUG post_current: {'c', 'e'}, slow: 3, fast: 5 

DEBUG pre_current: {'c', 'e'}, slow: 3, fast: 5
DEBUG post_current: {'e', 'f', 'c'}, slow: 3, fast: 6 

DEBUG pre_current: {'e', 'f', 'c'}, slow: 3, fast: 6
DEBUG post_current: {'g', 'e', 'f', 'c'}, slow: 3, fast: 7 

.
----------------------------------------------------------------------
Ran 1 test in 0.001s

作为结论,该解决方案采用了两个指针技术和 Kadane 算法的思想。我认为在作为初学者花费数小时调试之后,最终有可能解决它。

但是,我阅读了如此微妙的解决方案


class SolutionA:
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        #slow is the first  which not duplicate in a subarray
        #fast is the last whichi not duplicate in a subarray
        lookup, glo, slo, fas = {}, 0, 0, 0
        for fas, ch in enumerate(s):
            if ch in lookup: 
                slo = max(slo, lookup[ch]+1)
            elif ch not in lookup:
                glo = max(glo, fas-slo+1)                
            lookup[ch] = fas #update the duplicates and add new 
        return glo

该解决方案非常聪明,老实说,如果以前没有阅读过它,我真的不相信一个人可以在几个小时内设计出这样的解决方案。

它使用了hash map,是kadane算法思想的两倍,结构非常简洁。

它是两个指针的常用技术吗?它叫什么名字

标签: pythonalgorithmhashmap

解决方案


如第二种方法中解决方案的评论中所述:

慢是第一个在子数组中不重复的

快速是子数组中不重复的最后一个

它使用 2 个指针来跟踪没有重复字符的窗口大小。如果找到重复项,它会相应地更新指针。

换句话说,它维护一个窗口并进一步滑动它们以查看它可以与non-repeating characters属性一起使用多长时间。所以,这种方法被称为滑动窗口技术

这对于只有 26 个字母字符的字符串可能看起来微不足道,但对于UTF-8类型的字符串非常有用。


推荐阅读