nlp - Each word is converted to a 96x1 numeric vector instead of 300x1 (I am using nlp=spacy.load('en',vectors='en_glove_cc_300_1m_vectors'))

Problem description

    import spacy

    def answer_embedding(text):
        """Append one vector per token of `text` to the global feature lists."""
        print("\n Creating answer embeddings...")
        nlp = spacy.load('en', vectors='en_glove_cc_300_1m_vectors')
        text = text.lower()
        docx = nlp(text)

        print([token.text for token in docx])
        print("text length:", len(docx))
        for i in range(len(docx)):
            answer_feature.append(docx[i].vector)
            #print(docx[i].vector,"\n")
        answer_feature_set.append(answer_feature)
        print("ans features shape: ", len(answer_feature), len(answer_feature[0]))

    fact_feature_set = []
    question_feature_set = []
    answer_feature_set = []

    # triplet_list is not shown in the question; index 2 of each entry is the answer text
    for i in range(len(triplet_list[0:1])):
        answer_feature = []
        answer_embedding(triplet_list[i][2])
        print("answer_feature_set", i)
        print(answer_feature_set[i])
        #answer_feature_set[i] = pad_sequences(answer_feature_set[i], maxlen=max_length, padding='post')

------------------Code output------------------

Creating answer embeddings...

['trumpet']
text length: 1
ans features shape:  1 96

[array([ 2.4779391 , -1.6003039 , -0.9587759 , -1.699791  ,  2.3693838 ,  2.645733  ,
         4.481431  , -0.2480979 , -0.10860936,  3.5003755 ,  2.92377   , -0.3663767 ,
         0.9723637 ,  1.8907986 ,  0.78999585, -1.01864   ,  0.04050708,  1.5967814 ,
        -1.5551249 , -0.927932  , -0.24869117, -0.23154023, -1.4431062 , -1.4001569 ,
         0.6366439 ,  0.11352289,  0.04570532, -3.0624876 ,  2.3406787 , -2.4995136 ,
         1.5955511 , -0.45652902, -0.85962564,  2.4930985 ,  2.3721867 , -1.9983075 ,
         3.872029  , -0.13894784, -1.5572866 , -1.4099059 ,  3.8904212 ,  2.8950882 ,
        -2.0232072 , -3.0111172 , -0.52683604, -0.28643116,  0.15628016, -1.3392415 ,
         0.13402215,  2.2797725 , -0.4048978 , -1.4552156 , -2.9745393 , -3.004692  ,
        -1.3241203 , -0.09389547,  2.4897652 ,  2.1278718 , -0.5398587 , -0.43278995,
         0.40643644, -1.5307406 , -1.1823719 ,  0.69182336,  1.1646341 , -2.3956645 ,
         1.591285  , -2.3812287 , -2.0568147 ,  2.0366228 , -1.5198123 ,  2.6346998 ,
         3.2309058 ,  2.8691058 , -2.175885  , -1.9422603 ,  0.40466923, -0.6050789 ,
        -1.2262808 ,  2.7045658 ,  1.2141304 , -2.1289878 , -3.3201828 , -0.41113475,
         1.0044719 ,  1.9359473 , -1.6282874 ,  1.155003  , -0.9112543 , -3.0722218 ,
        -0.87314045,  0.44169876, -2.2623038 ,  2.1645956 ,  0.88755935, -0.4172356 ], dtype=flo]
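
A quick sanity check can make the mismatch visible before the vectors are passed further along. The sketch below is not part of the original code: `find_dim_mismatches` and its `expected_dim` parameter are hypothetical names, and it only assumes that each vector supports `len()` (a NumPy array such as `token.vector` does).

```python
def find_dim_mismatches(vectors, expected_dim=300):
    """Return (index, actual_dim) pairs for vectors whose length is not expected_dim."""
    return [(i, len(v)) for i, v in enumerate(vectors) if len(v) != expected_dim]


# Toy stand-ins for token vectors (hypothetical data): one 96-dimensional, one 300-dimensional.
toy_vectors = [[0.0] * 96, [0.0] * 300]
print(find_dim_mismatches(toy_vectors))  # [(0, 96)]
```

Called on `answer_feature` after the loop, it would list every token whose vector is not 300-dimensional; for the run shown above that would be the single token 'trumpet' with length 96.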

Tags: nlp, stanford-nlp
