poll_project_v1.0

thoxans
Posts: 1350
Joined: Tue Dec 11, 2018 7:48 pm

Post by thoxans »

imported first spreadsheet today as a test. used the curtis' ballot from our 6th annual top 100 poll

spreadsheet_test.png
---
Site Admin
Posts: 2136
Joined: Tue Dec 11, 2018 1:30 am

Post by --- »

Holy crap that's a good list

Post by thoxans »

while i brainstorm ways to structure the spreadsheets to ease their import into python, i'm also considering: simplifying ballot formatting to cut down on data entry errors, or otherwise using str.lower() to force all entries to a consistent case; and addressing other errors, e.g., misspellings, by figuring out how to find and merge fuzzy duplicates
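
For anyone following along, here is a minimal sketch of those two ideas: lowercasing every entry, then flagging likely misspellings with the standard library's difflib. The function names, the sample entries, and the 0.85 cutoff are all assumptions made for illustration, not anything taken from the actual script.

```python
import difflib


def normalize_case(name):
    """Trim and lowercase an entry so 'Fritz Lang ' and 'fritz lang' match."""
    return name.strip().lower()


def find_fuzzy_duplicates(names, cutoff=0.85):
    """Return pairs of distinct entries whose similarity ratio is >= cutoff.

    The cutoff is a guess; it would need tuning against real ballots.
    """
    unique = sorted(set(names))
    pairs = []
    for i, name in enumerate(unique):
        # Only compare against later names so each pair is reported once.
        matches = difflib.get_close_matches(
            name, unique[i + 1:], n=5, cutoff=cutoff
        )
        pairs.extend((name, match) for match in matches)
    return pairs


# Hypothetical ballot entries with a case mismatch and a misspelling.
entries = ["Fritz Lang", "fritz lang ", "Luis Bunuel", "luis bunnuel", "John Ford"]
cleaned = [normalize_case(e) for e in entries]
suspects = find_fuzzy_duplicates(cleaned)
print(suspects)
```

The flagged pairs are only candidates. A person would probably still want to confirm each merge by eye, since two short names can be similar without being the same director.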
flip
Posts: 3468
Joined: Thu Dec 13, 2018 7:07 am
Location: montreal

Post by flip »

this has probably occurred to you already, but if you're just trying to do a kind of python test to see if you can automate tallying, probably the easiest poll to start with is the favourite directors poll bure is running - with just a list of names (rather than titles, directors and years in various formats) and a simple scoring system, that would probably be easy to code for

Post by thoxans »

combined and merged two top ten test lists (first pic), while aggregate summing the duplicates. output list (second pic) is ordered alphabetically rn, rather than by point total, but i'll get to that later. there's ofc a lot more i'll need to account for once i get around to working with the ballots here, but i feel this is good foundational stuff

testlist.png

aggsum.png
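
Since the code itself only lives in the screenshots, here is one hedged way the combine-and-sum step could look, using collections.Counter. The two lists and their point values are invented for the example, and the real script may well use a different structure (a spreadsheet library, for instance).

```python
from collections import Counter


def merge_ballots(*ballots):
    """Combine any number of {name: points} ballots, summing duplicates."""
    total = Counter()
    for ballot in ballots:
        # Counter.update adds to existing counts instead of overwriting them.
        total.update(ballot)
    return total


# Hypothetical test lists; the point values are made up.
list_a = {"alfred hitchcock": 10, "buster keaton": 9, "akira kurosawa": 8}
list_b = {"buster keaton": 10, "chantal akerman": 9, "alfred hitchcock": 8}

combined = merge_ballots(list_a, list_b)

# Alphabetical output, matching the ordering described in the post.
for name in sorted(combined):
    print(name, combined[name])
```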

Post by thoxans »

aaand sorted

sorted.png
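
A small sketch of sorting by point total rather than alphabetically, assuming the tally is a plain dict of name to points. Breaking ties alphabetically is an assumption here; the thread doesn't say how ties are handled.

```python
def rank(tally):
    """Return (name, points) pairs, highest points first, ties alphabetical."""
    return sorted(tally.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical tally; the values are made up for the example.
tally = {
    "akira kurosawa": 8,
    "fritz lang": 9,
    "alfred hitchcock": 18,
    "buster keaton": 19,
    "chantal akerman": 9,
}

ranked = rank(tally)
for name, points in ranked:
    print(name, points)
```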

Post by thoxans »

today's to-do list: take this script out for a test drive on the ongoing fav dir poll data. gonna use the lists as they're currently posted. might disregard certain ballots based on formatting just to start, and simply to ease the perf of the script. then, once i have the similarly formatted lists tallied, i'll focus on solutions to those formatted more uniquely

Post by thoxans »

so here's what i have so far. just three or four ballots are missing (wba's obvs; but also the ballots with numbered entries). cut down on some case-related dupes with str.lower(), but there are still dupes from accents/diacritics and from differing naming conventions (e.g., name order). gonna work on that now. current unofficial, incorrect, incomplete results have been whited out below fwiw (spoiler isn't working)

favdirscriptsofar.png

alfred hitchcock 29
buster keaton 21
akira kurosawa 18
chantal akerman 16
jacques rivette 13
mikio naruse 12
stanley kubrick 12
josef von sternberg 12
sergei parajanov 12
robert bresson 12
federico fellini 12
jean epstein 11
ingmar bergman 11
hou hsiao-hsien 11
eric rohmer 11
fritz lang 11
ritwik ghatak 10
hiroshi shimizu 9
rainer werner fassbinder 9
howard hawks 9
ernst lubitsch 9
john ford 9
werner herzog 9
yasujiro ozu 9
carl theodor dreyer 9
orson welles 9
jonas mekas 8
jacques tourneur 8
james benning 8
joao cesar monteiro 8
douglas sirk 8
kenji mizoguchi 8
mani kaul 8
michelangelo antonioni 8
nicholas ray 8
raul ruiz 8
sergio leone 8
joão césar monteiro 8
éric rohmer 8
andrei tarkovsky 8
coen brothers 8
louis feuillade 7
william wyler 7
abbas kiarostami 7
king hu 7
martin scorsese 7
peter greenaway 7
pier paolo pasolini 7
alain resnais 7
jia zhangke 7
edward yang 7
kelly reichardt 6
frederick wiseman 6
woody allen 6
luis buñuel 6
frank borzage 6
maurice pialat 6
johnnie to 6
sam fuller 6
billy wilder 6
john cassavetes 6
david fincher 5
julio bracho 5
miyazaki hayao 5
charles m. jones 5
kurosawa akira 5
luis bunuel 5
bahram beizai 5
marguerite duras 5
david lynch 5
ken russell 5
govindan aravindan 5
wim wenders 5
aki kaurismaki 5
jean-luc godard 5
robert parrish 5
françois truffaut 5
edgar ulmer 5
richard linklater 5
kobayashi masaki 4
toyoda toshiaki 4
shinkai makoto 4
russell rouse 4
arturo ripstein 4
alexander dovzhenko 4
zoltan fabri 4
noah baumbach 4
xavier dolan 4
petr zelenka 4
kurt kren 4
lars von trier 4
peter hutton 4
mamoru ishii 4
andrzej zulawski 4
parviz kimiavi 4
ozu yasujirō 4
whit stillman 4
paul thomas anderson 4
dario argento 4
elio petri 4
erich von stroheim 4
hong sang-soo 4
frank capra 4
hal hartley 4
francis coppola 4
fyodor khitruk 4
alfred vohrer 3
pedro almodovar 3
bela tarr 3
paul schrader 3
charlie chaplin 3
paolo sorrentino 3
otar iosseliani 3
masaki kobayashi 3
olivier assayas 3
pedro costa 3
terry gilliam 3
teuvo tulio 3
monte hellman 3
george roy hill 3
tim burton 3
mike leigh 3
andrey zvyagintsev 3
michael mann 3
nuri bilge ceylan 3
sydney pollack 3
fred zinnemann 3
gregg araki 3
claire denis 3
christopher nolan 3
roy andersson 3
roberto rossellini 3
shunji iwai 3
sidney lumet 3
alain robbe-grillet 3
ridley scott 3
francis ford coppola 3
rene cardona jr 3
raúl ruiz 3
franco piavoli 3
raoul ruiz 3
agnès varda 3
steven soderbergh 3
preston sturges 3
steven spielberg 3
peter weir 3
peter watkins 3
max ophüls 3
aki kaurismäki 3
sam mendes 3
edgar g ulmer 3
wong kar-wai 3
artavazd peleshyan 3
abel ferrara 3
jacques becker 3
yasujirô ozu 3
john waters 3
atom egoyan 3
jacques tati 3
yasujirō ozu 3
bill forsyth 3
joachim trier 3
yorgos lanthimos 3
yoshishige yoshida 3
zack snyder 3
bennett miller 3
zhangke jia 3
jean rollin 3
jean renoir 3
david lean 3
guy gilles 3
wong kar wai 3
kenneth anger 3
tsai ming-liang 3
guy maddin 3
margaret tait 3
claude chabrol 3
manoel de oliveira 3
hark tsui 3
harun farocki 3
hayao miyazaki 3
kim jee-woon 3
david cronenberg 3
lino brocka 3
leonardo favio 3
heinosuke gosho 3
wang bing 3
hirokazu kore-eda 3
agnes varda 3
koreeda hirokazu 3
werner schroeter 3
tenghiz abuladze 2
yuri ilyenko 2
shohei imamura 2
don hertzfeldt 2
jean cocteau 2
friedrich wilhelm murnau 2
a. tarkovsky 2
františek vláčil 2
king vidor 2
f.w. murnau 2
jacques demy 2
liu chia-liang 2
germaine dulac 2
nagisa oshima 2
jesus franco 2
alice guy 2
jerzy skolimowski 2
jean-marc vallée 2
alfonso cuaron 2
francesco rosi 2
robert siodmak 2
s. eisenstein 2
jean-jacques beineix 2
jean gremillon 2
celine sciamma 1
jafar panahi 1
joe may 1
iwai shunji 1
djibril diop mambéty 1
isao takahata 1
satyajit ray 1
chang cheh 1
anthony mann 1
charles chaplin 1
emilio fernandez 1
henry king 1
marcel l'herbier 1
thom eberhardt 1
gabor body 1
francois truffaut 1
r w fassbinder 1
saeed akhtar mirza 1
sai paranjpye 1
krzysztof kieslowski 1



sidenote: if the curtis would rather i didn't post results, then just let me know!
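
One of the dupe types visible in the tally above is name order (kurosawa akira vs akira kurosawa, zhangke jia vs jia zhangke). A possible way to catch those, offered as a sketch and not as what the script does, is to group entries by their sorted tokens. The helper names are hypothetical, and the sample points are copied from the interim list above.

```python
def order_insensitive_key(name):
    """Build a key that ignores token order: 'kurosawa akira' -> 'akira kurosawa'."""
    return " ".join(sorted(name.split()))


def merge_name_orders(tally):
    """Sum entries that differ only in word order.

    Each merged group is labelled with its highest-scoring spelling.
    """
    groups = {}
    for name, points in tally.items():
        groups.setdefault(order_insensitive_key(name), []).append((points, name))

    merged = {}
    for variants in groups.values():
        label = max(variants)[1]  # spelling with the most points
        merged[label] = sum(points for points, _ in variants)
    return merged


# A few rows taken from the interim tally above.
sample = {
    "akira kurosawa": 18,
    "kurosawa akira": 5,
    "jia zhangke": 7,
    "zhangke jia": 3,
    "fritz lang": 11,
}

merged = merge_name_orders(sample)
print(merged)
```

This does nothing for abbreviations or dropped middle names (francis coppola vs francis ford coppola, a. tarkovsky vs andrei tarkovsky); those would still need fuzzy matching or a hand-made alias table.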

Post by --- »

There is simply no way Rohmer is that low

Post by thoxans »

i blame the é

Post by thoxans »

eliminated those pesky accents/diacritics. now just need to account for the fuzzy dupes, due to inconsistent adherence to naming systems, abbreviations, misspellings, etc. there's undoubtedly unnecessarily repeated lines of code here, but whatevah. i'll worry about cleaning it up, and making it look pretty, once i actually get the program to do what i want

favdirdecoded.png
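
The screenshot isn't readable as text, so the exact method used is unknown. One common standard-library approach, shown here purely as an assumption, is to decompose each character with unicodedata and drop the combining marks.

```python
import unicodedata


def strip_accents(text):
    """Remove diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Spellings that appear as separate rows in the interim tally.
for name in ["éric rohmer", "joão césar monteiro", "yasujirō ozu", "yasujirô ozu"]:
    print(name, "->", strip_accents(name))
```

Letters that have no Unicode decomposition, such as ø or ł, pass through unchanged with this approach, so a small manual replacement table might still be needed for some names.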

Post by --- »

hm interesting. hate to break it to you, but that's not quite it! i think 9 of the top 10 are correct, although not always in the order you have

Post by thoxans »

oh, those aren't the final results! still haven't included aforementioned awkwardly formatted ballots, or even mine own (not to mention dem fuzzy dupes)

just running test scripts until i solve one problem, before moving onto the next

Post by thoxans »

stripped punctuation and white space, further parsing the data, and thus altering the tally again. gonna work on the fuzzy stuff now. how's it lookin, curtis?


favdirstrip.png
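
A rough sketch of what stripping punctuation and whitespace could look like with str.translate. Turning hyphens into spaces (so wong kar-wai and wong kar wai end up identical) is a design choice assumed here; the actual script may simply delete them.

```python
import string

# Delete every ASCII punctuation character, except that a hyphen becomes a space.
# The hyphen rule is an assumption, not something confirmed by the thread.
_PUNCT_TABLE = {ord(ch): None for ch in string.punctuation}
_PUNCT_TABLE[ord("-")] = " "


def strip_punct_and_space(name):
    """Drop punctuation, then collapse runs of whitespace and trim the ends."""
    no_punct = name.translate(_PUNCT_TABLE)
    return " ".join(no_punct.split())


for raw in ["wong kar-wai", "  edgar g.  ulmer ", "s. eisenstein", "marcel l'herbier"]:
    print(repr(raw), "->", repr(strip_punct_and_space(raw)))
```

This merges edgar g. ulmer with edgar g ulmer, but not with edgar ulmer, which is the kind of case left over for the fuzzy matching step.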