- Kick-off: "Sweet Home Country Grammar" by Mei-Lwun
- Announcements
    - Go over names
    - Go over assignment 01
    - Survey
- New Material
    - Talked about a single-state Markov model
    - Talked about probabilities and how to calculate them
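The probability calculation can be sketched in code. This is a minimal illustration with a made-up toy corpus, not the class data: it estimates P(next | current) for a first-order Markov model by counting adjacent word pairs and normalizing.

```python
from collections import Counter, defaultdict

# Toy corpus (made up for illustration).
words = "the dog ran the dog sat the cat sat".split()

# Count adjacent pairs: how often each word follows each other word.
pair_counts = defaultdict(Counter)
for cur, nxt in zip(words, words[1:]):
    pair_counts[cur][nxt] += 1

# Normalize each row of counts into a conditional distribution.
probs = {
    cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
    for cur, nxts in pair_counts.items()
}

print(probs["the"])  # "dog" follows "the" 2/3 of the time, "cat" 1/3
```

The estimates are just relative frequencies, which is exactly what makes unseen pairs a problem (their probability comes out zero) and motivates smoothing below.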
- Smoothing
    - Training a model is based on generalizing over a set of training instances.
    - Memorizing examples is consistent but doesn't generalize.
    - Occam's razor: "all other things being equal, the simplest consistent explanation is best."
    - P(feature|class) = n_feature_and_class / n_class
    - Since you rarely have all possible training instances, you must account for those you do not have:
        - a rare feature
        - avoid zeroes in the probability distributions
    - Smoothing is one way to do this.
    - Laplace smoothing using an m-estimate assumes that each feature is given a prior probability, p, that is assumed to have been previously observed in a "virtual" sample of size m:
        - P(feature|class) = (n_feature_and_class + m*p) / (n_class + m)
    - For binary features, p is assumed to be 0.5.
    - This is equivalent to seeing each word in a category once.
    - Do an example
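An example along these lines could look like the following sketch. The counts (0 or 10 feature occurrences in a class of 50 instances) are invented for illustration; the formula is the m-estimate above.

```python
def m_estimate(n_fc, n_c, p, m):
    """m-estimate smoothing: P(feature|class) = (n_fc + m*p) / (n_c + m)."""
    return (n_fc + m * p) / (n_c + m)

# An unseen feature gets probability 0 without smoothing:
print(0 / 50)                      # 0.0 -- would zero out the whole product
# With a binary-feature prior p = 0.5 and a virtual sample of size m = 2:
print(m_estimate(0, 50, 0.5, 2))   # (0 + 1) / 52, small but nonzero
# A frequently seen feature is barely changed by the smoothing:
print(m_estimate(10, 50, 0.5, 2))  # (10 + 1) / 52, close to 10/50
```

The key property: no feature ever gets probability zero, so one unseen word can no longer veto an entire class.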
- Using a Markov model generatively
    - Start with probability estimates and a good random number generator,
    - then walk forward through the world.
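The generative walk can be sketched as follows. The transition table is a hypothetical hand-built example (not trained on real data); at each step the next state is sampled according to the current state's distribution.

```python
import random

# Hypothetical transition table: P(next | current) for a tiny model.
transitions = {
    "the": {"dog": 0.7, "cat": 0.3},
    "dog": {"ran": 0.5, "sat": 0.5},
    "cat": {"sat": 1.0},
    "ran": {"the": 1.0},
    "sat": {"the": 1.0},
}

def walk(start, steps, rng=random.Random(0)):
    """Walk forward: repeatedly sample the next state from the
    current state's probability distribution."""
    state, out = start, [start]
    for _ in range(steps):
        options = list(transitions[state])
        weights = [transitions[state][o] for o in options]
        state = rng.choices(options, weights=weights)[0]
        out.append(state)
    return out

print(" ".join(walk("the", 8)))
```

Seeding the generator makes the walk reproducible; with a fresh seed each run produces a different plausible sequence.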
- Assignment 02
    - The data set is now available
    - Only 2 overlaps
```
./ACTION/Leon_JiShengyue.txt
./ACTION/PirateOfTheCaribbean_SinhaPinaki.txt
./ACTION/PulpFiction_NasrRamzi.txt
./ACTION/PulpFiction_SutterNathan.txt
./ACTION/Terminator_BichutskiyVadim.txt
./ACTION/TheBourneIdentity_JiShengyue.txt
./COMEDY/AmericanSplendor_VernicaRares.txt
./COMEDY/DogDayAfternoon_VernicaRares.txt
./COMEDY/DumbAndDumber_DesaiChaitanya.txt
./COMEDY/DumbAndDumber_PirzadehPouria.txt
./COMEDY/Election_TikuZubin.txt
./COMEDY/Friends1_PirzadehPouria.txt
./COMEDY/Friends2_PirzadehPouria.txt
./COMEDY/Friends3_PirzadehPouria.txt
./COMEDY/Friends4_PirzadehPouria.txt
./COMEDY/TheresSomethingAboutMary_SaprooSameer.txt
./DRAMA/Braveheart_PartidaAugusto.txt
./DRAMA/Brick_JavanmardiSara.txt
./DRAMA/FightClub_NasrRamzi.txt
./DRAMA/GodFather_HabibiAmir.txt
./DRAMA/GodFatherII_HabibiAmir.txt
./DRAMA/Rocky_LinsteadEric.txt
./DRAMA/SixthSense_AlmishariMishari.txt
./DRAMA/TheQueen_JavanmardiSara.txt
./DRAMA/TheSting_LinsteadEric.txt
./DRAMA/TrainingDay_AlmishariMishari.txt
./FAMILY/FatherOfTheBride_SinhaPinaki.txt
./FAMILY/Shrek_BauTien.txt
./FANTASY/AI_SaprooSameer.txt
./FANTASY/Dune_TikuZubin.txt
./FANTASY/LOTR_ROTK_PartidaAugusto.txt
./FANTASY/ReturnOfTheJedi_BauTien.txt
./MISC/Pi_SutterNathan.txt
./MISC/PrettyWoman_BichutskiyVadim.txt
./MISC/TheMummy_DesaiChaitanya.txt
```
- Break
    - Face transformer
- More new material
    - Classifying a sequence
    - Probability calculation
        - Multiplying many small probabilities quickly underflows to zero in floating point.
    - Underflow prevention
        - log(P(a) * P(b)) = log(P(a)) + log(P(b))
        - The class with the highest final unnormalized log probability score is the most probable.
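The log trick can be sketched as follows. The per-class likelihoods and uniform priors are invented numbers for illustration; the point is that summing logs replaces multiplying small probabilities, and the argmax is unchanged because log is monotonic.

```python
import math

# Hypothetical per-class likelihoods P(feature_i | class) for one sequence.
likelihoods = {
    "COMEDY": [0.01, 0.002, 0.03],
    "DRAMA":  [0.02, 0.001, 0.01],
}
prior = {"COMEDY": 0.5, "DRAMA": 0.5}

# Sum of logs instead of a product of probabilities: no underflow,
# same ordering of classes.
scores = {
    c: math.log(prior[c]) + sum(math.log(p) for p in ps)
    for c, ps in likelihoods.items()
}
best = max(scores, key=scores.get)
print(best)  # COMEDY: its product (3e-7) beats DRAMA's (1e-7)
```

With three features the raw products are still representable, but a full script has thousands of words, and the product of thousands of values below 1 underflows long before the log sum does.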
    - Bring this back to classifying a movie genre