All posts

(66)

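[Paper Review] Learning Spatiotemporal Features with 3D Convolutional Networks Submit : Tran, Du. ICCV (2015) Paper : https://arxiv.org/pdf/1412.0767.pdf This post is still a rough draft. 0. Abstract Learns spatiotemporal features with a deep 3D conv network. Three findings: 3D conv is better suited than 2D conv for learning spatiotemporal features; among 3D convs, the 3×3×3 conv kernel performed best; C3D achieved the best performance on 4 different benchmarks. The features are compact: with only 10 dimensions, they reach 52.8% accuracy on UCF101. Fast conv inference makes them computationally efficient. Conceptually very simple and easy to train. 1. Introduction When it comes to understanding video..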

[Paper Review] Multi channel CNN for Korean Sentiment Analysis Submit : Kim, Min. HCLT (2018) Paper : https://www.researchgate.net/publication/329609677_Multi-channel_CNN_for_Korean_Sentiment_Analysis Code : 0. Abstract Proposes a Multi-channel CNN that feeds the morphemes, syllables, and jamo (graphemes) of a Korean sentence through separate conv layers at the same time. For colloquial sentences that contain typos, features that a morpheme-based CNN cannot extract can be extracted from the syllables and jamo. 1. Introduction CNNs were designed for vision, but were later shown to be useful for NLP as well..

[Paper Review] ObamaNet: Photo realistic lip sync from text Submit : Rithesh Kumar, Jose Sotelo, Kundan Kumar, Alexandre de Brebisson, Yoshua Bengio. arxiv(2017) Paper : https://arxiv.org/abs/1801.01442 Code : https://github.com/acvictor/Obama-Lip-Sync 0. Abstract text, audio -> video : higher dimensional signal. The core problem is lip motion -> how to synchronize the region around the mouth. The other parts of the face (eyes, head, upper lip, background), from the footage of the original video ..

[Paper Review] Everybody Dance Now Submit: Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros. ICCV (2019) Paper: https://arxiv.org/abs/1808.07371 0. Summary Proposes a simple method for "do as I do" motion transfer. Per-frame img2img translation. Runs pose detection on the source and maps the result onto the target. Uses a GAN for the face to make it look more natural. 1. Learning Takes the pix2pix architecture and customizes it for the problem. Train: conditional GAN based. image -> pose estimation -> true image(dist) / pose estima..
