Doc2Vec & Others


frances._.sb 2022. 3. 2. 13:26


This is a brief summary of a lecture by Professor Pilsung Kang of Korea University.

We look at embeddings at the sentence, paragraph, and document level.

[2015] Document Embedding

- Paragraph vectors are shared for all windows generated from the same paragraph, but not across paragraphs

The paragraph ID keeps the same value for every word modeled within that paragraph.

- Word vectors are shared across all paragraphs
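The sharing rules above can be sketched by building PV-DM training examples by hand. This is a minimal illustration, not the lecture's implementation: the function name `pv_dm_examples`, the toy paragraphs, and the window size of 3 are all assumptions made for this sketch. Each example pairs a paragraph ID with a window of context words and the next word as the target.

```python
from typing import List, Tuple

Example = Tuple[int, Tuple[str, ...], str]

def pv_dm_examples(paragraphs: List[List[str]], window: int = 3) -> List[Example]:
    """Build (paragraph_id, context_words, target_word) triples.

    Hypothetical helper for illustration only.
    """
    examples: List[Example] = []
    for pid, tokens in enumerate(paragraphs):
        # Slide a window over the paragraph: `window` context words,
        # followed by the next word as the prediction target.
        for i in range(len(tokens) - window):
            context = tuple(tokens[i:i + window])
            target = tokens[i + window]
            # The same pid is attached to every window of this paragraph.
            examples.append((pid, context, target))
    return examples

# Toy data (assumed, not from the lecture).
paragraphs = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
examples = pv_dm_examples(paragraphs, window=3)
for ex in examples:
    print(ex)
```

Every window from the first paragraph carries ID 0 and every window from the second carries ID 1, while words such as "sat" and "on" appear under both IDs. In a real model the ID would index a paragraph-vector table and each word would index a single word-vector table shared by all paragraphs.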

- Ignore the context words in the input, and force the model to predict words randomly sampled from the paragraph in the output

The PV-DM model predicts which word comes next, whereas the model above (PV-DBOW) does not depend on word order: its target words can simply be randomly sampled.

- It does not need word vectors as input.
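A rough sketch of how PV-DBOW training examples differ from PV-DM ones: the input is only the paragraph ID, and the target is a word drawn at random from that paragraph. The function name `pv_dbow_examples`, the sample count, and the fixed seed are assumptions for illustration. Sampling directly from the whole paragraph is also a simplification of this sketch.

```python
import random
from typing import List, Tuple

def pv_dbow_examples(
    paragraphs: List[List[str]],
    samples_per_paragraph: int = 4,
    seed: int = 0,
) -> List[Tuple[int, str]]:
    """Build (paragraph_id, target_word) pairs.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    pairs: List[Tuple[int, str]] = []
    for pid, tokens in enumerate(paragraphs):
        for _ in range(samples_per_paragraph):
            # No context words in the input: just the paragraph ID,
            # paired with a randomly sampled word to predict.
            pairs.append((pid, rng.choice(tokens)))
    return pairs

# Toy data (assumed, not from the lecture).
dbow_paragraphs = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
dbow_pairs = pv_dbow_examples(dbow_paragraphs)
for pair in dbow_pairs:
    print(pair)
```

Because no context words appear on the input side, the only input parameters a model trained on these pairs needs are the paragraph vectors, which is why this variant stores less than PV-DM.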

- PV-DM alone usually works well for most tasks, but the combination of PV-DM and PV-DBOW is recommended
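One common way to combine the two models is to concatenate the paragraph vector learned by each. The sketch below assumes that reading of "combination"; the function name `combine_vectors` and the vector values are made up for illustration and are not trained embeddings.

```python
from typing import List

def combine_vectors(pv_dm: List[float], pv_dbow: List[float]) -> List[float]:
    """Concatenate the PV-DM and PV-DBOW vectors of one paragraph.

    Hypothetical helper for illustration only.
    """
    return list(pv_dm) + list(pv_dbow)

# Made-up vectors for a single paragraph (assumed values).
pv_dm_vec = [0.1, -0.3, 0.5]
pv_dbow_vec = [0.7, 0.2]
doc_vec = combine_vectors(pv_dm_vec, pv_dbow_vec)
print(len(doc_vec), doc_vec)
```

The combined vector has as many dimensions as the two inputs together, so a downstream classifier sees the features of both models side by side.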

[2016+] Let's Embed Everything!
