mixed strategy play in games∗
Jason Shachat† J. Todd Swarthout‡
Lijia Wei§
July 7, 2012
Abstract
We propose a statistical model to assess whether individuals strategically use mixed strategies in repeated games. We formulate a hidden Markov model in which the latent state space contains both pure and mixed strategies, and allows switching between these states. We apply the model to data from an experiment in which human subjects repeatedly play a normal form game against a computer that always follows its part of the unique mixed strategy Nash equilibrium profile. Estimated results show significant mixed strategy play and non-stationary dynamics. We also explore the ability of the model to forecast action choice.
JEL classification: C92; C72; C10
Keywords: Mixed Strategy; Nash Equilibrium; Experiment; Hidden Markov Model
∗ This paper supersedes the previous working paper, “Man versus Nash: An experiment on the self enforcing nature of mixed strategy equilibrium.”
† Wang Yanan Institute for Studies in Economics and MOE Key Laboratory in Econometrics, Xiamen University. jason.shachat@gmail.com
‡ Department of Economics and Experimental Economics Center, Georgia State University. swarthout@gsu.edu
§ Wang Yanan Institute for Studies in Economics and MOE Key Laboratory in Econometrics, Xiamen University. ljwie.wise@gmail.com
1 Introduction
Game theory and the Nash equilibrium solution concept are a key framework in the social sciences for modeling interactive behavior. The formulation of a normal form game consists of a set of players, a set of possible actions for each player, and a payoff function for each player that gives a real-valued payoff for any possible joint action profile – a list of actions consisting of one for each player. A Nash equilibrium is a joint action profile such that each player’s assigned action results in at least as high a payoff to the player as any other possible action, assuming all other players choose their respective actions in the Nash equilibrium profile. If players are restricted to deterministically choose an action, then there are many games that don’t have a Nash equilibrium, such as the childhood game of Rock, Scissors, Paper. Confronted with this problem, Von Neumann (1928) generalized a player’s decision from choosing an action to choosing a probability distribution over his possible actions.1 This choice of a probability distribution is called a “mixed” strategy, and a degenerate mixed strategy which chooses a particular action with probability one is called a “pure” strategy. The introduction of mixed strategies allows for existence of equilibrium across a broad class of games: from minimax solutions for zero-sum games (Von Neumann, 1928; Von Neumann and Morgenstern, 1944) to noncooperative equilibria for n-person games (Nash, 1951). While the role of mixed strategies in defining logically consistent solution concepts is not in doubt, the positive aspect of individuals actually playing mixed strategies is an open question of considerable interest.
Researchers’ efforts to answer this question have naturally focused on settings where the use of mixed strategies is most compelling: the repeated play of games which have a unique mixed strategy Nash equilibrium. The value of “being unpredictable” is readily seen in examples such as serves in tennis, “bluffing” in poker, and whether or not a tax authority audits a taxpayer. A common approach in this literature is to test whether the players’ action choices are consistent with the mixed strategy equilibrium. Some studies using controlled experiments with human subjects have the advantage of knowing the payoff functions, and test whether choice frequencies agree with the equilibrium strategies and whether players’ sequences of actions are serially independent (O’Neill, 1987; Binmore, Swierzbinski, and Proulx, 2001; Morgan and Sefton, 2002; Selten and Chmura, 2008). Other studies consider high-level sports competitions, such as soccer (Chiappori, Levitt, and Groseclose, 2002; Palacios-Huerta, 2003; Bareli, Azar, Ritov, Keidarlevin, and Schein, 2007) and tennis (Walker and Wooders, 2001), with the advantage of studying highly experienced players competing for high stakes and the disadvantage of unknown payoff functions.2 These studies focus on testing the serial independence of action choice and the equilibrium implication of equal payoffs across action choices. Some of the most prominent and recurring results for both types of studies are that aggregated action frequencies across players agree with the equilibrium mixed strategies but individual action frequencies do not, and for many individuals action choices are serially correlated, violating the independence prediction.
1 Along with generalizing the set of feasible actions to the set of mixed strategies, a player’s payoff function is extended by setting its value to the expected payoff given a profile of mixed strategies, commonly referred to as the expected utility property.
To reconcile these issues of serial correlation and heterogeneity, several studies (Ochs, 1995; Bloomfield, 1994; Shachat, 2002; Noussair and Willinger, 2011) conduct laboratory experiments using the same type of games but directly elicit mixed strategies by obligating players to select a probability distribution over actions.3 Elicited strategies in these experiments exhibit various distinct patterns. Some subjects choose pure strategies almost exclusively, some choose strictly mixed strategies almost exclusively, and others use both types of strategies – usually in long sequences. Also, certain mixed strategies are often quite focal, such as choosing equal probability weight on a subset of actions rather than the Nash equilibrium proportions. Naive interpretation of these results suggests a clear distinction between play that is purposely unpredictable and play that is a pure best response to changing forecasts of an opponent’s action (Nyarko and Schotter, 2002). A more cautious interpretation is that subjects may eschew the randomizing device provided by the experimenter and instead internally randomize, or perhaps subjects choose strictly mixed strategies due to the experimenter effect of the novel elicitation method. Clearly a less invasive method to detect mixed strategy play would be valuable.
In this study we propose a hidden Markov model (HMM) to detect whether observed action choices are the result of pure or mixed strategy play in repeated two-person finite action games.4 There are three key ideas in our formulation: (1) we treat the strategy a player follows as a latent state and the action played as the observable output from the latent strategy; (2) the set of possible latent states is a discrete subset of all possible mixed strategies containing pure strategies, Nash equilibrium or minimax strategies, and focal mixed strategies; and (3) a player switches the latent strategy he follows according to a first order Markov process. We then demonstrate the ability of the model by applying it to a new experimental data set we collect. In our experiment, each human subject repeatedly plays a 2×2 game against a computer player that follows its mixed strategy equilibrium. Some subjects play a zero-sum game and others an unprofitable game.5 The estimated HMMs reveal several interesting results, including: (1) significant amounts of both pure and mixed strategy play; (2) the focal equiprobable mixed strategy is played more often than the Nash equilibrium strategy; (3) low transition probabilities between mixed and pure latent strategies; (4) dynamic adjustments in the types of strategies players follow over time; and (5) appreciable rates of both mixed and pure strategy play in the limiting distributions of the HMMs (interpreted as the long run equilibrium of play). We then extend the HMM from a statistical framework for evaluating hypotheses to one for forecasting action choice and assess its predictive accuracy.
2 The action sets are typically comprised of simple actions, e.g., {serve left, serve right} and {defend left, defend right}. The payoffs are assumed to be the probability of winning the task and these probabilities will differ based upon both the comparative skills between players and the relative strengths a player has for each action.
3 For example, Shachat (2002) adopts a game with four actions, each identified by a different color, for each player. Each player must fill a box with 100 cards in any combination of the four colored card types, and then one card is selected at random to determine the action played.
4 See Rabiner (19) for a classic introduction to hidden Markov models.
2 A HMM of switching strategies
Consider an experiment in which we observe M pairs of subjects, each playing T periods of the same 2×2 normal form game. Often games like this are described by a two-by-two table, and for familiarity purposes we denote one subject’s player role as Row and the other as Column. We label each player role’s two possible actions Left (L) and Right (R), and express a subject’s mixed strategy as the probability of playing L. Of particular interest is when the game has a single Nash equilibrium and it is in strictly mixed strategies, although our framework is not restricted to study only such cases. Three factors confounding the analysis of data generated by this type of process are the latency of players’ mixed strategies, the heterogeneity of strategy adoption across subjects, and variation of adopted latent strategies over the course of repeated play. In this section, we present a model that accommodates and allows estimation of these confounds.
Consider the following HMM for a fixed player role. The state space S is an n-element subset of subject i’s possible mixed strategies. Denote by s_{i,t} ∈ S the strategy used by subject i in period t, let S_i be the set of all possible T-element sequences of mixed strategies for i with typical element s_i, and let s be the collection of s_i for all M subjects in a given player role. Let y_{i,t} denote subject i’s realized action in period t, y_i the corresponding T-element sequence of i’s observable actions, and y the collection of y_i for all M subjects. View {y, s} as the output of the HMM.
The probability structure of the HMM has three elements. The first is the n-element vector B, whose element B_j is the probability a subject chooses action Left, i.e. the mixed strategy, if he is in state j. We will provide two analyses which differ in how we specify B. In one approach we consider B as known a priori, and S and B are redundant notation. Usually, in this approach, B contains the two pure strategies, other strategies suggested by theory such as Nash equilibrium or minimax, and other focal strategies. In the second approach we treat the elements of B as unknown parameters – the state dependent mixed strategies. The second element of the structure, π, is the initial multinomial probability distribution over S. The third element, P, is the n×n transition probability matrix. The element P_{jk} is the probability a subject adopts strategy k in period t conditional upon having adopted strategy j in period t−1.
5 An unprofitable game is one in which the minimax and Nash equilibrium solutions are distinct but yield the same expected payoff for each player.
The likelihood function of (B, π, P) is
\[ L(B, \pi, P \mid y, s) = \Pr(y, s \mid B, \pi, P). \]
Rewriting this likelihood in terms of the marginal distributions of y and s gives us
\[ L(B, \pi, P \mid y, s) = \Pr(y \mid s, B, \pi, P) \cdot \Pr(s \mid B, \pi, P). \]
Next, we assume that the marginal distribution of y conditional on s is independent of π and P. In other words, once the state is realized then the probability of a Left action relies solely on the mixed strategy of the current state. Also, by the specification of the HMM, s is independent of the state dependent mixed strategies B. This allows us to restate the previous likelihood function as
\[ L(B, \pi, P \mid y, s) = \Pr(y \mid s, B) \cdot \Pr(s \mid \pi, P). \]
Since the sequence of states for each subject is unobservable, we evaluate the likelihood by integrating over the set of all possible sequences
\[ L(B, \pi, P \mid y, s) = \prod_{i=1}^{M} \sum_{s_i \in S_i} \left[ \pi(s_{i,1}) \, B_{s_{i,1}}^{I_{y_{i,1}}} (1 - B_{s_{i,1}})^{1 - I_{y_{i,1}}} \prod_{t=2}^{T} P_{s_{i,t-1}, s_{i,t}} \, B_{s_{i,t}}^{I_{y_{i,t}}} (1 - B_{s_{i,t}})^{1 - I_{y_{i,t}}} \right], \]
where I_{y_{i,t}} is an indicator function which equals one for the action Left and zero for the action Right. As T grows, the number of calculations needed to evaluate this likelihood quickly becomes computationally impractical. We describe the Bayesian approach we take to estimate the HMM, although one could proceed down a frequentist path of maximizing the expected likelihood function using some variation of the EM (expectation-maximization) algorithm.
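To make the computational point concrete, the following sketch evaluates the likelihood of a single subject's action sequence in two ways: by summing over all n^T state sequences, as in the expression above, and by the standard forward recursion for HMMs, which needs only on the order of T·n² operations. This is not the estimation route taken in the paper, which is Bayesian; the function names and the numerical values of B, pi, and P are hypothetical and chosen only for illustration.

```python
from itertools import product


def emission(b, y):
    """Probability of action y (1 = Left, 0 = Right) under mixed strategy b."""
    return b if y == 1 else 1.0 - b


def likelihood_brute_force(y, B, pi, P):
    """Sum the joint probability of (y, s) over all n**T state sequences s."""
    n, T = len(B), len(y)
    total = 0.0
    for s in product(range(n), repeat=T):
        prob = pi[s[0]] * emission(B[s[0]], y[0])
        for t in range(1, T):
            prob *= P[s[t - 1]][s[t]] * emission(B[s[t]], y[t])
        total += prob
    return total


def likelihood_forward(y, B, pi, P):
    """Same quantity via the forward recursion, in O(T * n**2) operations."""
    n = len(B)
    # alpha[k] = Pr(y_1..y_t, s_t = k)
    alpha = [pi[k] * emission(B[k], y[0]) for k in range(n)]
    for t in range(1, len(y)):
        alpha = [
            sum(alpha[j] * P[j][k] for j in range(n)) * emission(B[k], y[t])
            for k in range(n)
        ]
    return sum(alpha)


# Hypothetical illustrative values, not estimates from the paper.
B = [0.0, 0.5, 1.0]            # pure Right, equiprobable mix, pure Left
pi = [0.2, 0.6, 0.2]
P = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]
y = [1, 1, 0, 1, 0, 0]         # one subject's actions, 1 = Left

brute = likelihood_brute_force(y, B, pi, P)
forward = likelihood_forward(y, B, pi, P)
```

For M subjects the full likelihood is the product of the per-subject values. The brute-force version is included only to show that the two computations agree on a short sequence; with T = 200 it would be infeasible.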
In the Bayesian analysis, we first factor the joint posterior distribution of the unknown HMM parameters and unobserved states s into the product of marginal conditional posterior distributions. Then we evaluate these marginal conditional posteriors through an iterative sampling procedure called the Markov Chain Monte Carlo (MCMC) method. MCMC is a simple but powerful procedure in which the empirical distributions of the sampled parameters converge to the true posterior distributions. After convergence, iterative sampling is continued to construct empirical density functions. These are used to make inferences regarding the parameters of the hidden Markov models.
Consider the posterior density function on the realized unobserved states and HMM parameters h(s, B, P, π | y). First, express this joint density as the product of the marginal density of HMM parameters conditional on the observed action choices and unobserved states with the marginal density of the states conditional upon action choices
\[ h(s, B, P, \pi \mid y) = h(B, P, \pi \mid s, y) \, h(s \mid y). \]
We have already assumed that the transition matrix P and initial probabilities over states π are independent of the action choices and state contingent mixed strategies B, which allows us to state
\[ h(s, B, P, \pi \mid y) = h(B \mid s, y) \, h(P, \pi \mid s, y) \, h(s \mid y). \]
This product of three conditional posteriors permits a simple Markov Chain procedure of sequentially sampling from these distributions. We start with some initial arbitrary values for the HMM parameters, (B^{(l)}, P^{(l)}, π^{(l)}) where l = 0. We create s^{(0)} by simulation using P^{(0)} and π^{(0)} without conditioning on y. From these initial parameter values and the observed action sequences y, we use a Gibbs sampling algorithm to generate an initial sample of state sequences s^{(1)}. Then we make a random draw P^{(1)} from the posterior distribution of P conditional on s^{(1)} and y, and proceed similarly to make a random draw of π^{(1)}. We complete the iteration by making a random draw B^{(1)} from the posterior of B conditional on s^{(1)} and y. The key to the MCMC method is that as l → ∞, the joint and marginal distributions of B^{(l)}, P^{(l)}, and π^{(l)} converge weakly to the joint and marginal posterior distributions of these parameters (Geman and Geman, 1987). We now describe the details of each step in an iteration of the MCMC procedure.
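Step 1: Sampling the state sequences s^{(l)}
We begin by describing a Gibbs sampling technique for generating draws from the distribution of s^{(l)} conditional upon y and (B^{(l−1)}, P^{(l−1)}, π^{(l−1)}). The elements of s_i can be drawn sequentially for each t conditioning on the observed action choice y_{i,t}, the realized state in other periods, π, and P. Let s_{i,≠t} be the vector obtained by removing s_{i,t} from the sequence s_i. Given s_{i,≠t}, we express the conditional posterior distribution of s_{i,t} as
\[ \Pr(s_{i,t}^{(l)} \mid y_{i,t}, B^{(l-1)}, P^{(l-1)}, s_{i,\neq t}^{(l)}, s_{i,\neq t}^{(l-1)}) \propto \Pr(y_{i,t} \mid s_{i,t}^{(l)}, B^{(l-1)}) \cdot \Pr(s_{i,t}^{(l)} \mid P^{(l-1)}, s_{i,\neq t}^{(l)}, s_{i,\neq t}^{(l-1)}) \]
with
\[ \Pr(s_{i,t}^{(l)} \mid P^{(l-1)}, s_{i,\neq t}^{(l)}, s_{i,\neq t}^{(l-1)}) = \Pr(s_{i,t}^{(l)} = k \mid P^{(l-1)}, s_{i,t-1}^{(l)}, s_{i,t+1}^{(l-1)}). \]
Consequently, the conditional posterior probability of s_{i,t} = k and t > 1 is
\[ \Pr(s_{i,t}^{(l)} = k \mid \cdot) = \frac{\Pr(y_{i,t} \mid s_{i,t}^{(l)} = k, B_k^{(l-1)}) \cdot \Pr(s_{i,t}^{(l)} = k \mid P^{(l-1)}, s_{i,t-1}^{(l)}, s_{i,t+1}^{(l-1)})}{\sum_{j=1}^{n} \Pr(y_{i,t} \mid s_{i,t}^{(l)} = j, B_j^{(l-1)}) \cdot \Pr(s_{i,t}^{(l)} = j \mid P^{(l-1)}, s_{i,t-1}^{(l)}, s_{i,t+1}^{(l-1)})}, \]
and for t = 1
\[ \Pr(s_{i,1}^{(l)} = k \mid \cdot) = \frac{\Pr(y_{i,1} \mid s_{i,1}^{(l)} = k, B_k^{(l-1)}) \cdot \Pr(s_{i,1}^{(l)} = k \mid \pi^{(l-1)}, s_{i,2}^{(l-1)})}{\sum_{j=1}^{n} \Pr(y_{i,1} \mid s_{i,1}^{(l)} = j, B_j^{(l-1)}) \cdot \Pr(s_{i,1}^{(l)} = j \mid \pi^{(l-1)}, s_{i,2}^{(l-1)})}. \]
The state s_{i,t}^{(l)} is determined by making a random draw from the uniform distribution on the unit interval, and comparing this draw to the calculated conditional probability of s_{i,t}^{(l)}.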
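The single-site update in Step 1 can be sketched as follows for one subject. This is an illustrative reading of the step, not the authors' code: the function names are hypothetical, periods are indexed from 0, and the last period, which has no successor state, is handled by dropping the outgoing transition term (an assumption, since the text does not spell out that case). The sketch also assumes that at least one state has positive weight at every period, which holds whenever P has strictly positive entries and B contains a strictly mixed strategy.

```python
import random


def emission(b, y):
    """Probability of action y (1 = Left, 0 = Right) under mixed strategy b."""
    return b if y == 1 else 1.0 - b


def state_posterior(t, y, s, B, pi, P):
    """Normalized conditional probabilities of s[t] = k given y[t] and neighbours."""
    n, T = len(B), len(y)
    weights = []
    for k in range(n):
        w = emission(B[k], y[t])                  # Pr(y_t | s_t = k, B_k)
        w *= pi[k] if t == 0 else P[s[t - 1]][k]  # initial or incoming transition
        if t < T - 1:
            w *= P[k][s[t + 1]]                   # outgoing transition to s_{t+1}
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def sample_states(y, s, B, pi, P, rng=random):
    """One Gibbs sweep: redraw each s[t] in turn, updating s in place."""
    for t in range(len(y)):
        probs = state_posterior(t, y, s, B, pi, P)
        u = rng.random()                          # uniform draw on [0, 1)
        cumulative = 0.0
        for k, p in enumerate(probs):
            cumulative += p
            if u < cumulative:
                s[t] = k
                break
        else:
            s[t] = len(probs) - 1                 # guard against rounding error
    return s
```

Because s is updated in place, the state at t−1 used in each draw already comes from the current sweep while the state at t+1 is still the one from the previous sweep, which mirrors the (l) and (l−1) superscripts in the expressions above.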
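Step 2: Sampling the transition matrix P^{(l)} and π^{(l)}
The posterior distributions of P_{jk} and π depend only upon s^{(l)} and the priors. We specify the prior of π as a Dirichlet distribution h(π; α_1, ..., α_n) where α_j = 1, for 1 ≤ j ≤ n. Similarly, we specify the prior of the jth row of P as a Dirichlet distribution h(P_{j1}, ..., P_{jn}; η_{j1}, ..., η_{jn}). In an experiment, we record the data from the true start of the HMM process, so we assume that the joint posterior is simply the product of these two marginal posteriors. The respective posteriors of π^{(l)} and P^{(l)} are
\[ h(\pi \mid s) \propto \Pr(s \mid \pi) \, h(\pi; \alpha_1, \dots, \alpha_n), \]
and
\[ h(P_{j1}, \dots, P_{jn} \mid s) \propto \Pr(s \mid P_{j1}, \dots, P_{jn}) \, h(P_{j1}, \dots, P_{jn}; \eta_{j1}, \dots, \eta_{jn}). \]
If ν_{0j} is the number of incidences of s_{i,1} = j in s^{(l)}, and ν_{jk} is the count of transitions from state j to k in s^{(l)}, then the conditional probabilities in the two posterior calculations are multinomial distributions
\[ h(\pi \mid s) \propto \pi_1^{\nu_{01}} \cdots \pi_{n-1}^{\nu_{0,n-1}} \cdot \left( 1 - \sum_{k=1}^{n-1} \pi_k \right)^{\nu_{0n}} h(\pi; \alpha_1, \dots, \alpha_n) \]
and
\[ h(P_{j1}, \dots, P_{jn} \mid s) \propto P_{j1}^{\nu_{j1}} \cdots P_{j,n-1}^{\nu_{j,n-1}} \cdot \left( 1 - \sum_{k=1}^{n-1} P_{jk} \right)^{\nu_{jn}} h(P_{j1}, \dots, P_{jn}; \eta_{j1}, \dots, \eta_{jn}). \]
Since the Dirichlet distribution is the conjugate prior for the multinomial distribution, these posterior distributions are also Dirichlet distributions for which each shape parameter is the sum of its prior value and the respective count
\[ h(\pi \mid s) = h(\pi; \alpha_1 + \nu_{01}, \dots, \alpha_n + \nu_{0n}) \]
and
\[ h(P_{j1}, \dots, P_{jn} \mid s) = h(P_{j1}, \dots, P_{jn}; \eta_{j1} + \nu_{j1}, \dots, \eta_{jn} + \nu_{jn}). \]
Hence, we select π^{(l)} and P^{(l)} by taking random draws from these distributions.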
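A minimal sketch of the conjugate update in Step 2, assuming the sampled state sequences are available as lists of integer state indices. The helper names are hypothetical, and the Dirichlet draw is built by normalizing independent Gamma draws from Python's standard library rather than taken from any particular statistics package.

```python
import random


def dirichlet_draw(shape, rng=random):
    """Draw from a Dirichlet distribution by normalizing independent Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in shape]
    total = sum(g)
    return [x / total for x in g]


def sample_pi_and_P(sequences, n, alpha, eta, rng=random):
    """Posterior draws of pi and of each row of P given sampled state sequences.

    sequences: one list of state indices (0..n-1) per subject
    alpha:     Dirichlet prior parameters for pi (length n)
    eta:       Dirichlet prior parameters for P (n rows of length n)
    """
    nu0 = [0] * n                        # nu0[j]: subjects whose first state is j
    nu = [[0] * n for _ in range(n)]     # nu[j][k]: transitions from j to k
    for s in sequences:
        nu0[s[0]] += 1
        for prev, cur in zip(s, s[1:]):
            nu[prev][cur] += 1
    pi = dirichlet_draw([alpha[j] + nu0[j] for j in range(n)], rng)
    P = [dirichlet_draw([eta[j][k] + nu[j][k] for k in range(n)], rng)
         for j in range(n)]
    return pi, P
```

Setting every entry of alpha to one reproduces the uniform prior on π stated in the text; the values of eta are left to the caller because the text does not report them in this section.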
Step 3: Sampling the state dependent mixed strategies B
For our initial approach to modeling the state dependent mixed strategies, we assume B corresponds to a known subset of S. In our Bayesian analysis this is equivalent to assuming a point prior on these strategies, and therefore there is no updating. So in our Gibbs sampling procedure we skip this step, and proceed to the next iteration of the Gibbs sampler. Of course this is a rather strong prior to assume, and we should evaluate whether it is appropriate. Accordingly, we conduct an auxiliary analysis in which we assume a uniform prior over the set of all mixed strategies.
In the auxiliary analysis we proceed as follows. The priors of state dependent mixed strategies B_1, ..., B_n are assumed independent of each other and of the Markov process governing the states. Given these assumptions, we can think of each B_j as a Bernoulli probability, and each Left (Right) action as a success (failure) when occurring in state j. The likelihood function is calculated as a binomial trial. Since it is the conjugate prior of the binomial, we assume the prior is a Beta distribution, denoted β(B_j; ζ_j, γ_j). We want a uniform prior as well, and that corresponds to setting the shape parameters ζ_j and γ_j to one. The posterior distribution is simply
\[ h(B_j \mid y, s^{(l)}) = \beta(B_j; \zeta_j + \kappa_{L,j}, \gamma_j + \kappa_{R,j}), \]
where κ_{L,j} and κ_{R,j} are the number of times the actions Left and Right, respectively, are chosen when in state j according to s^{(l)}. The state conditional mixed strategies B_j^{(l)}, j = 1, ..., n, are randomly drawn from these Beta posterior distributions, completing an iteration of the Gibbs sampler.
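The Beta update for the auxiliary analysis can be sketched in a few lines. The function name and argument layout are hypothetical; actions are coded 1 for Left and 0 for Right, and the default shape parameters of one correspond to the uniform prior described above.

```python
import random


def sample_B(actions, states, n, zeta=1.0, gamma=1.0, rng=random):
    """Draw each B_j from its Beta posterior given actions and sampled states.

    actions: one list of actions per subject (1 = Left, 0 = Right)
    states:  one list of state indices per subject, aligned with actions
    """
    kappa_L = [0] * n    # Left choices observed while in state j
    kappa_R = [0] * n    # Right choices observed while in state j
    for y, s in zip(actions, states):
        for a, k in zip(y, s):
            if a == 1:
                kappa_L[k] += 1
            else:
                kappa_R[k] += 1
    return [rng.betavariate(zeta + kappa_L[j], gamma + kappa_R[j])
            for j in range(n)]
```

Under the point-prior approach this step is skipped entirely and B stays fixed across iterations.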
The Gibbs sampler is run for a large number of iterations until the empirical distribution of all the parameters has converged (Geweke, 1991). Then the sampling procedure is allowed to continue to run for another number of iterations to build up an empirical distribution that corresponds to the posterior distribution of the HMM parameters. It is from this empirical distribution that we conduct statistical inferences.
3 The experiment
We apply our HMM framework to a new experimental data set that provides a likely setting for mixed strategies, and particularly Nash equilibrium strategies, to be adopted. Additionally, our procedures allow us to estimate for one player role without the need to also simultaneously model the opposing role, because each human subject repeatedly plays against a computer player that follows its mixed strategy equilibrium. Each subject is informed that his opponent is a computer but is given no information regarding the computer’s strategy. We adopt two different games in our experimental design, with each subject playing only one of the two games. One game is zero-sum and the other game is unprofitable.
3.1 The games
Our first game is a zero-sum asymmetric matching pennies game introduced by Rosenthal, Shachat, and Walker (2003). The normal form representation of this game is presented on the left side of Figure 1. The game is called Pursue-Evade because the Row player “captures” points from the Column player when the actions of the two players match, and the Column player “evades” a loss when the players’ actions differ. In the game each player can move either Left or Right, and the game has a unique Nash equilibrium in which each player chooses Left with probability two-thirds. In equilibrium, Row’s expected payoff is two-thirds, and correspondingly Column’s expected payoff is negative two-thirds.
Our second game is an unprofitable game introduced by Shachat and Swarthout (2004) called Gamble-Safe. Each player has a Gamble action (Left for each player) which yields a payoff of either two or zero, and a Safe action (Right for each player) which guarantees a payoff of one. The normal form representation of this game is presented on the right side of Figure 1. The Gamble-Safe game has a unique Nash equilibrium in which each player chooses the Left action with probability one-half, and each player earns an expected equilibrium payoff of one. Right is the minimax strategy for both players with a guaranteed payoff of one. Aumann (1985) argues that the Nash equilibrium prediction is not plausible in such an unprofitable game because its adoption assumes unnecessary risk to achieve the corresponding Nash equilibrium payoff. For example, imagine Row has Nash equilibrium beliefs and best responds by playing the Nash strategy. Row’s expected payoff is one. However, suppose Column instead adopts his minimax strategy Right. This reduces Row’s expected payoff to one-half. Row could avoid this risk by simply playing the minimax strategy. This aspect makes the Gamble-Safe game a more challenging test for the hypothesis of mixed strategy play than the zero-sum Pursue-Evade game.

          Pursue-Evade Game                  Gamble-Safe Game
            Column Player                      Column Player
                 Left     Right                     Left     Right
Row     Left     1, -1    0, 0     Row     Left     2, 0     0, 1
Player  Right    0, 0     2, -2    Player  Right    1, 2     1, 1

Figure 1: The experimental games
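The equilibrium claims for both games can be checked directly from the payoffs in Figure 1 using the usual indifference conditions for a 2×2 game with an interior mixed equilibrium. The sketch below uses exact fractions; the function names are hypothetical, and the formulas assume the game has no pure strategy equilibrium, so that the denominators are nonzero.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(row_payoffs, col_payoffs):
    """Return (p, q), the probabilities of Left for Row and Column.

    payoffs[a][b] is the payoff when Row plays a and Column plays b,
    with index 0 = Left and 1 = Right.
    """
    A, C = row_payoffs, col_payoffs
    # q makes Row indifferent between Left and Right.
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # p makes Column indifferent between Left and Right.
    p = Fraction(C[1][1] - C[1][0], C[0][0] - C[1][0] - C[0][1] + C[1][1])
    return p, q


def expected_payoff(payoffs, p, q):
    """Expected payoff when Row plays Left w.p. p and Column plays Left w.p. q."""
    return (p * q * payoffs[0][0] + p * (1 - q) * payoffs[0][1]
            + (1 - p) * q * payoffs[1][0] + (1 - p) * (1 - q) * payoffs[1][1])


# Payoffs from Figure 1.
pe_row = [[1, 0], [0, 2]]
pe_col = [[-1, 0], [0, -2]]
gs_row = [[2, 0], [1, 1]]
gs_col = [[0, 1], [2, 1]]

pe_p, pe_q = mixed_equilibrium_2x2(pe_row, pe_col)
gs_p, gs_q = mixed_equilibrium_2x2(gs_row, gs_col)

# Row plays the Nash mix while Column deviates to pure Right (q = 0).
gs_row_vs_minimax = expected_payoff(gs_row, gs_p, Fraction(0))
```

The last line reproduces the risk argument in the text: against a Column player who plays Right, the Nash mix earns Row one-half rather than the one guaranteed by the Safe action.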
3.2 Subject recruitment and experiment protocol
We conducted six experiment sessions in the Finance and Economics Experimental Laboratory (FEEL) at Xiamen University during December 2011. A total of 110 undergraduate and master’s students participated in the experiment, with each session containing between 12 and 22 subjects. Of these, 56 subjects were assigned to the Gamble-Safe game treatment and the remainder to the Pursue-Evade game treatment. Subjects were evenly divided into Row and Column player roles within each treatment. FEEL uses the ORSEE online recruitment system for subject recruitment (Greiner, 2004), and at the time of the experiment approximately 1400 students were in the subject pool. A subset of students from the subject pool were invited to attend each specific session, and these students were informed that they would receive a 10 Yuan show-up payment and have the opportunity to earn more money during the experiment. Further, the invitation stated that the session would last no more than two hours.
Upon arrival at the laboratory, each subject was seated at a computer workstation such that no subject could observe another subject’s screen. Subjects first read instructions detailing how to enter decisions and how earnings were determined.6 Then, 200 repetitions of the game were played. For the Pursue-Evade game, Column subjects were initially endowed with a balance of 260 tokens each, and Row subjects none. Each token was worth one-third of a Yuan. Each subject’s total earnings consisted of the 10 Yuan show-up payment plus the monetary value of his token balance after the 200th repetition. While a mathematical possibility, no Column subjects in the Pursue-Evade game went bankrupt.
The experiment was conducted with a Java software application created at the Georgia State University Experimental Economics Center (ExCEN) that allows humans to play normal form games against computerized algorithms. At the beginning of each repetition, each subject saw a graphical representation of the game on the screen. Each Column subject’s game display was transformed so that he appeared to be a Row player. Thus, each subject selected an action by clicking on a row, and then confirmed his choice. After the repetition was complete, each subject saw the outcome highlighted on the game display, as well as a text message stating both players’ actions and his own earnings for that repetition. Finally, each subject’s current token balance and a history of past play were displayed at all times. The history consisted of an ordered list with each row displaying the repetition number, the actions selected by both players, and the subject’s payoff from the specific repetition.
6 The instructions are available at http://www.excen.gsu.edu/swarthout/HMM/
3.3 Data summary
We begin the summary of the experimental data by providing views of the joint distribution of the proportion of Left play for each subject-computer pair, while conditioning on whether the data are from the first 100 or last 100 repetitions. Figures 2 and 3 present these views for the Pursue-Evade and Gamble-Safe treatments, respectively. In each of these figures, the x-axis is the proportion of Left play for the Column player and the y-axis is the proportion of Left play for the Row player. Each arrow in the figures represents the play of a single human-computer pair, with the arrow tail representing the joint frequency of Left play in the first 100 repetitions and the arrow head representing the joint frequency of Left play in the final 100 repetitions. These arrows show the adjustments subjects make from the first half to the second half of play. We see that many arrows suggest substantial change in the human player frequency, but the changes do not trend in any one direction or uniformly towards the Nash equilibrium. Human play also displays greater dispersion and displacement from the Nash equilibrium than the computer opponents, suggesting nonconformity with the Nash equilibrium predictions.
Table 1 presents the means and standard deviations of subjects’ frequencies of Left play by treatment and role. Recall that we have 2700 observations for each role in the Pursue-Evade treatment and 2800 observations for each role in the Gamble-Safe treatment. Although the Row player mean is close to the Nash equilibrium proportion in both game treatments, the Nash equilibrium proportion is rejected for all four cases at any reasonable level of significance. In each of the four cases, subjects’ proportions of Left play display too much variance to have
[Figures 2 and 3: arrow plots of the joint proportion of Left play for each human-computer pair. Panels: “Human Row vs. NE Column” and “NE Row vs. Human Column”. Axis labels: “Computer Row Proportion Left”, “Human Row Proportion Left”, “Computer Column Proportion Left”; all axes run from 0.0 to 1.0.]
Table 1: Aggregate Summary Statistics

Statistic                            P-E Row   P-E Col   G-S Row   G-S Col
Average Left frequency               0.63      0.51      0.48      0.30
Standard deviation Left frequency    0.11      0.15      0.15      0.20
Nash equilibrium z-test statistic    −6.       −25.06    −3.18     −30.20
the Left action is played in the first and second 100 repetitions. A two-tailed binomial test of the Nash equilibrium at the 95 percent level of confidence gives us critical regions of less than 58 and more than 76. We reject the Nash proportion of Left play for 13 (12) of the Row subjects during the initial (final) 100 repetitions, and we reject the Nash proportion for 21 (20) of the Column subjects during the initial (final) 100 repetitions.
Next, we evaluate whether the subjects’ sequences of action choices are serially independent via a nonparametric runs test. The z-test statistic is approximately standard normal and is a function of the sequence length R, and the number of Left and Right sequences, r_L and r_R, respectively. Its value is
\[ z = \frac{r_L + r_R - \dfrac{2 r_L r_R}{R} - \dfrac{1}{2}}{\sqrt{\dfrac{2 r_L r_R (2 r_L r_R - R)}{R^2 (R - 1)}}}. \]
The null hypothesis of the test is that a subject’s choices are independent realizations of a binomial random variable. We conduct a two-tailed test. Rejections from larger values of the test statistic indicate too many runs, and are symptomatic of negative serial correlation. Rejections from smaller values indicate too few runs, and are symptomatic of positive serial correlation. For the Row players, we reject serial independence for 10 subjects in the first half of the sample, and only 4 subjects in the second half. For the Column players, the number of rejections is 14 and 10 for the first and second half, respectively. There is a notable bias with respect to the Column players; 22 out of 24 of the rejections come from z scores that are too negative and indicate strong positive serial correlation. This is consistent with the results found by Rosenthal et al. (2003) in the original study of the Pursue-Evade game, but atypical for other studies which often find negative serial correlation.
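For readers who want to experiment with the test, here is a sketch of the textbook Wald-Wolfowitz runs statistic for a binary action sequence. It is offered as an illustration under stated assumptions rather than as the exact statistic reported in the tables: it uses the counts of Left and Right actions in the mean and variance, counts runs directly from the sequence, and applies no continuity correction, so its values may differ slightly from those in the paper. The function name is hypothetical.

```python
import math


def runs_test_z(actions):
    """Textbook Wald-Wolfowitz runs z statistic for a 0/1 sequence.

    Returns None when the statistic is undefined, for example when the
    sequence contains only one kind of action.
    """
    n_left = sum(1 for a in actions if a == 1)
    n_right = len(actions) - n_left
    if n_left == 0 or n_right == 0:
        return None
    total = n_left + n_right
    # A new run starts at every position where the action changes.
    runs = 1 + sum(1 for a, b in zip(actions, actions[1:]) if a != b)
    mean = 2.0 * n_left * n_right / total + 1.0
    var = (2.0 * n_left * n_right * (2.0 * n_left * n_right - total)
           / (total ** 2 * (total - 1)))
    if var <= 0:
        return None
    return (runs - mean) / math.sqrt(var)


z_alternating = runs_test_z([1, 0] * 10)            # many short runs
z_blocked = runs_test_z([1] * 10 + [0] * 10)        # two long runs
z_constant = runs_test_z([1] * 20)                  # no variation
```

A strictly alternating sequence yields a positive statistic (too many runs, negative serial correlation), while a sequence made of two long blocks yields a negative one (too few runs, positive serial correlation). A sequence with no variation returns None, which corresponds to the missing values noted in the tables.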
Table 3 presents a similar data summary for the individual subjects of the Gamble-Safe treatment. In this case, the Nash equilibrium mixed strategy is equiprobable, and the critical regions of the two-sided binomial tests are 39 or less and 60 or more Left action choices. For the Row players, the Nash hypothesis is rejected for 16 subjects in the first 100 repetitions and 15 in the second 100 repetitions. Correspondingly for the Column players, the Nash hypothesis is rejected for 25 subjects in the first half of repetitions and 21 players in the
Table 2: Pursue-Evade individual subject summary data.
RowPlayer
Rounds1-100
Pair1234567101112131415161718192021222324252627
n
eiz
ColumnPlayer
Rounds1-100LeftCount59n50n51n22n,e56n53n41nn63n,e39n,e46nn61en45n50n50n70e41n45n50n70e585887n,e62e41n
RunsStat−4.44i6.23i1.01−2.45i−1.281.85−0.49−3.58i−1.−3.08i−2.16i−2.57i0.72−2.16i−3.76i−2.01i−2.61i−0.72−0.08−3.15i0.40−1.68−1.391.71−2.98i1.04−2.57i
Rounds101-200LeftCount45n53n71e13n,e44n40n76e52n76e29n,e31n,en48n37n,e41nn20n,e52n22n,e25n,e62e47n78n,e66e100n,e5927n,e
RunsStat−3.35i3.47i0.69−2.08i−0.261.47−2.35i0.22−2.35i−3.96i1.70−3.78i−1.39−1.−5.90i−1.55−3.16i−1.390.49−6.86i1.470.−1.86−1.—z1.58−1.90
RunsStat0.20−0.20−0.60−1.22−0.351.81−0.58−0.191.18−1.73−1.120.940.96−0.36−1.091.39−1.−2.51i1.00−0.09−0.45−0.332.09i−0.24−2.67i−0.82−2.55i
Rounds101-200LeftCount71e66e85e65n,e63e37n,e68e48n84n,e55n55n75e5973e66e88n,e73e40n65e78n,ee84n,e60e43n57n75e
LeftCount77n,e67e77n,e49n5939n,e62e47n51n65e48n63e60e78n,e68e70e46n51n80n,e51n68eee42n76e45n
RunsStat−2.40i1.32−0.12−1.601.170.931.040.44−4.82i−4.53i2.63i−2.29i−1.26−1.110.792.19i0.72−2.16i2.02i2.21i0.61−0.12−0.240.42−0.970.42−6.19i
n: Two-sided binomial test rejection of the NE proportion of 2/3 at the 5% level of significance.
e: Two-sided binomial test rejection of equiprobable proportion at the 5% level of significance.
i: Runs test rejection of serial independence at the 5% level of significance.
z: Missing values due to inapplicability of test on data with zero variation.
second half of repetitions. Also, we see that 9 Column player subjects almost exclusively play the pure minimax strategy (over 90 times) in the last 100 repetitions, while there is only one such Row player. Further, we find evidence of serial correlation in many individuals’ choice sequences. For the Row players, we reject serial independence for 12 and 9 subjects in the first and last 100 repetitions, respectively. For the Column players, serial independence is rejected for 12 subjects in the first half of repetitions and 5 subjects in the second half of repetitions.
Table 3: Gamble-Safe individual subject summary data.
RowPlayer
Rounds1-100
Pair123456710111213141516171819202122232425262728
m
iz
ColumnPlayer
Rounds1-100LeftCount24m19m36m3m26m70m15m63m33m36m62m24m415114m11m73m34m4m8m2m20m12m65m39m38m30m
RunsStat−2.07i−0.91−3.29i−1.0.66−4.80i1.39−1.86−2.78i−1.33−1.730.70−1.33−4.42i1.230.22−2.15i−0.872.09i−2.33i−3.30i0.240.000.42−2.10i−0.12−2.37i−2.i
Rounds101-200LeftCount14m12m440m31m599m65m19m29m5224m19m498m9m598m62m2m4m0m32m32m4031m446m
RunsStat−0.880.42−5.36i
—z2.41i−0.49−0.24−0.77−3.i−1.271.02−0.691.06−0.200.901.02−0.080.900.830.240.44—z−1.281.04−2.51i−1.60−1.49−3.03i
RunsStat3.261.21−1.08−0.060.97−2.57i−1.43−2.50i−6.41i−3.25i−1.244.27i−2.59i2.19i−0.470.131.81−6.50i−1.58−0.670.96−1.−0.32−0.701.010.30−2.62i0.21
Rounds101-200LeftCount39m5176m41m63m5626m77m76m5315m444441504m28m38m5937m20m594963m24m60m
LeftCount434360m4072m504160m16m39m72mm33m32m4135m31m5668m4620m69m4668m80m63m
RunsStat0.61−3.29i−0.63−1.050.17−1.810.13−3.77i−2.22i−0.123.42i3.48i−3.46i−0.81−1.0.961.17−4.75i−0.−2.10i−0.35−2.77i−2.21i0.05−0.742.42i−4.74i−0.13
m: Two-sided binomial test rejection of equiprobable proportion at the 5% level.
i: Runs test rejection of serial independence at the 5% level of significance.
z: Missing values due to inapplicability of test on data with zero variation.
4 Results of the HMM statistical analysis
In this section we present the estimated HMMs for the Pursue-Evade and Gamble-Safe treatments. First we report the means and variances of the posterior distributions of the transition probability matrices and the initial distributions over states. The estimates reflect adoption of both pure and mixed strategies and characterize the switching between latent strategies. We then use these estimates to generate a description of the dynamics of the latent mixed strategy evolution. Finally, we provide an assessment of the robustness of some of our assumed priors.
4.1 Pursue-Evade game
For the Pursue-Evade game, we restrict the latent state space S to contain four elements. We treat the corresponding vector of state dependent mixed strategies B as fixed and known, and the four elements are the pure Right strategy (PR), the focal equiprobable mixed strategy (EM), the Nash equilibrium strategy (NE) of two-thirds, and the pure Left strategy (PL). Specifically, we assume a point prior of B = (0, 0.5, 0.67, 1). Using this point prior we estimate the HMM using the MCMC method.
We run the Gibbs sampler for 10,000 iterations. Using the last 5000 iterations, we establish that the empirical density functions have converged by applying the Geweke test (Geweke, 1991). Then we use these last 5000 iterations to make statistical inferences. Table 4 presents the estimated means and standard deviations of the transition probabilities between states, the same for the initial probabilities over state posteriors, and the calculated limiting distributions of the Markov chains for both Row and Column.
Table 4: Estimated transition matrices, initial and limiting distributions of Pursue-Evade game

Row Player
                        PR_{t+1}         EM_{t+1}         NE_{t+1}         PL_{t+1}
PR_t                    0.75 (0.038)     0.145 (0.037)    0.051 (0.024)    0.0 (0.024)
EM_t                    0.025 (0.009)    0.95 (0.012)     0.013 (0.006)    0.013 (0.006)
NE_t                    0.005 (0.002)    0.007 (0.003)    0.939 (0.014)    0.05 (0.013)
PL_t                    0.022 (0.011)    0.023 (0.011)    0.218 (0.035)    0.737 (0.034)
π                       0.082 (0.063)    0.614 (0.13)     0.193 (0.12)     0.111 (0.08)
Limiting Distribution   0.050            0.274            0.8              0.128

Column Player
                        PR_{t+1}         EM_{t+1}         NE_{t+1}         PL_{t+1}
PR_t                    0.752 (0.029)    0.204 (0.03)     0.025 (0.015)    0.018 (0.01)
EM_t                    0.095 (0.028)    0.879 (0.033)    0.019 (0.012)    0.007 (0.005)
NE_t                    0.012 (0.008)    0.021 (0.014)    0.96 (0.022)     0.007 (0.005)
PL_t                    0.039 (0.017)    0.034 (0.016)    0.031 (0.015)    0.6 (0.026)
π                       0.043 (0.040)    0.161 (0.127)    0.735 (0.141)    0.061 (0.057)
Limiting Distribution   0.178            0.385            0.356            0.080

Note: standard deviations are in parentheses.
Our estimation of the initial distribution over states is presented in the fifth numeric row of Table 4. For both roles we find initial play has a high rate of mixed strategy play. Row players predominantly follow the EM (61%), while the Column players predominantly follow the NE (74%). Interestingly, this is quite different from the limiting distribution of the estimated transition matrices, which we can interpret as the long run steady state of the HMM. For the Row player, the mode of the limiting distribution is the NE (55%), while for the Column player both EM and NE are roughly equally likely, with probabilities of 39% and 36%, respectively. Clearly there is movement of strategy adoption over time.
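The limiting distribution referred to here is the stationary distribution of the estimated transition matrix. One simple way to compute it, sketched below, is to apply the transition matrix repeatedly to an arbitrary starting distribution until it stops changing. The function name and the two-state matrix are hypothetical illustrations, not values from Table 4, and the approach assumes the chain is irreducible and aperiodic, as it is when all transition probabilities are strictly positive.

```python
def limiting_distribution(P, iterations=10_000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by repeatedly multiplying a starting distribution by P."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[j] * P[j][k] for j in range(n)) for k in range(n)]
    return dist


# Hypothetical two-state chain used only to illustrate the computation.
P_example = [[0.9, 0.1],
             [0.2, 0.8]]
limit_example = limiting_distribution(P_example)
```

For this example the balance condition 0.1·d_0 = 0.2·d_1 gives the stationary distribution (2/3, 1/3), so the sticky first state carries twice the long run weight of the second.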
Some aspects of these dynamics can be seen by inspection of the estimated transition probabilities, given in the first four numeric rows of Table 4. Large values on the main diagonals and corresponding small values on the off-diagonals indicate strong inertia in adopting new strategies. There are some interesting patterns when there is a transition between strategies. Consider the Row players first. When switching away from PR a player is almost three times as likely to switch to EM as to either of the other two strategies. Likewise, when switching away from EM a player is twice as likely to switch to PR as to either of the other strategies. There’s a similar probabilistic cycle between NE and PL with much larger switching probabilities between them. The dynamic effects of these cycling tendencies can be seen in Figure 4, which presents time series of the estimated proportion of subjects using each of the four strategies.7 The PL and the NE series tend to mirror one another, as do the PR and EM strategies – albeit with more noise.
The results for the Column players in the right hand side of Figure 4 are quite different. The use of NE steadily declines while the adoption of EM rises in the first 50 repetitions. Furthermore, we see a slow emergence of PR over the course of the experiment. The probabilistic cycle between the EM and PR strategies is evident by their sharp mirroring pattern.
[Figure: two panels (Row, Column); vertical axis: Probabilities, 0.0 to 1.0; horizontal axis: Periods, 0 to 200; series: NE, PL, PR, EM.]
Figure 4: Strategy dynamics in Pursue-Evade game
Next we assess the appropriateness of our degenerate prior on B by conducting the MCMC estimation using a uniform Beta prior, β(B_j; 1, 1), for each of the state dependent mixed strategies. We then sample from the posterior distributions to construct an empirical density function for each of the state dependent mixed strategies. In Figure 5 we present kernel smoothed plots of these approximations to the posterior densities. Inspection reveals that for the Row player the posteriors are sharply peaked and closely centered on our assumed four strategies, except for the NE, whose posterior has a mode close to 3/4 instead of 2/3. For the Column player we see that three out of four posteriors coincide with our assumed set. The one difference is the PR, whose posterior has a mode of about 0.15.

^7 For strategy j, the estimated proportion of subjects using that strategy in a given round t is $\frac{1}{27 \cdot 5000} \sum_{l=5001}^{10000} \sum_{i=1}^{27} I(s^{l}_{i,t} = j)$.
[Figure: two panels (Row, Column); vertical axis: Density; horizontal axis: Probability to Choose L, 0.0 to 1.0.]
Figure 5: Posterior distribution of B in Pursue-Evade game
4.2 The Gamble-Safe game
We now turn our attention to the Gamble-Safe game. Here, we restrict the latent state space S to contain three elements. In our estimation we treat B as fixed and consisting of the elements PR (the minimax strategy), EM (which is also the NE strategy), and PL. We use the same parameters for the Gibbs Sampler as we used in analyzing the PE data.
For both the Row and Column player data sets we ran the Gibbs Sampler for 10,000 iterations, using the last 5000 iterations for inference after testing for convergence of the empirical densities with the Geweke test. The posterior means and standard deviations are reported in Table 5. Comparing the estimated initial distribution π to the limiting distribution suggests that an initial high probability of the mixed Nash strategy play reduces over time for both player roles. The change for the Column player is more dramatic as EM goes from 60% to 40%, and that reduction corresponds to a rise in the minimax strategy PR from 34% to 53%.
In contrast to the PE game, there is a segregation between mixed strategy and pure strategy followers. Evidence of this is found in the estimated Markov transition matrices, as we can see they almost fail to be irreducible (roughly meaning we can always reach one state from another, even if it takes multiple transitions). The probability of continuing in the EM state is nearly one, indicating that once a subject follows the mixed strategy he is likely to do so for a large number of repetitions. Pure strategy adopters exhibit quite different patterns depending upon whether they are in the Row or Column role, in particular with respect to switching tendencies in the PL state. From the PL state, Row players transition to PR with 26% probability, while this transition probability is 79% for Column players.
Table 5: Estimated transition matrices, initial and limiting distributions of Gamble-Safe game

Row Player
                        PR_{t+1}        EM_{t+1}        PL_{t+1}
PR_t                    0.815 (0.027)   0.010 (0.020)   0.175 (0.016)
EM_t                    0.003 (0.011)   0.988 (0.021)   0.009 (0.011)
PL_t                    0.260 (0.031)   0.043 (0.031)   0.697 (0.037)
π                       0.169 (0.084)   0.779 (0.094)   0.052 (0.046)
Limiting Distribution   0.203           0.660           0.137

Column Player
                        PR_{t+1}        EM_{t+1}        PL_{t+1}
PR_t                    0.1 (0.010)     0.006 (0.010)   0.103 (0.007)
EM_t                    0.007 (0.013)   0.985 (0.019)   0.008 (0.008)
PL_t                    0.791 (0.031)   0.042 (0.031)   0.167 (0.044)
π                       0.337 (0.098)   0.596 (0.103)   0.067 (0.052)
Limiting Distribution   0.527           0.404           0.059

Note: standard deviations are in parentheses.
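The limiting distribution reported in Table 5 can be approximated directly from the posterior means of the transition probabilities. The following is a minimal sketch, assuming a matrix with rows indexed by the current state and columns by the next state (state order PR, EM, PL); the function name `limiting_distribution` and the use of power iteration are illustrative choices, not the authors' procedure, which may instead average limiting distributions over Gibbs draws.

```python
def limiting_distribution(P, tol=1e-12, max_iter=100_000):
    """Stationary distribution of a row-stochastic matrix P by power iteration.

    P[j][i] is the probability of moving to state i given current state j.
    """
    n = len(P)
    dist = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: new[i] = sum_j dist[j] * P[j][i]
        new = [sum(dist[j] * P[j][i] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist


# Posterior means for the Row player in Table 5 (rows: PR_t, EM_t, PL_t).
P_row = [
    [0.815, 0.010, 0.175],
    [0.003, 0.988, 0.009],
    [0.260, 0.043, 0.697],
]
pi_row = limiting_distribution(P_row)
print([round(p, 3) for p in pi_row])
```

For the Row player, the result should be close to the reported limiting distribution of roughly 0.203, 0.660, and 0.137 for PR, EM, and PL.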
Figure 6 presents the time series of the estimated proportion of subjects using each of the three latent strategies. Here we see the impact of the Markov transition probabilities that lead to inertia of the mixed strategy state and also the strong cycling tendencies of players between the Left and Right pure strategies. In the Row player figure, we see the EM strategy proportion has a smooth path that drops quickly from its initial level to its limiting value within the first 50 repetitions, after which it remains relatively constant. We also see the ragged mirroring pattern, indicating switching between the PR and PL strategies. We see similar features in the Column figure except that the EM shows a more gradual decline, and PR shows a corresponding gradual increase. This leads to the separation of the PR and PL strategies and allows us to see the clear short run switching between these strategies characterized by the jagged mirror relationship between their respective series.
[Figure: two panels (Row, Column); vertical axis: 0.0 to 1.0; horizontal axis: Periods, 0 to 200; series: NE, PL, PR.]
Figure 6: Strategy dynamics in Gamble-Safe game
We test the robustness of our point prior B = [0, 0.5, 1] by estimating an HMM for which these state conditional strategies each have a uniform Beta prior. The kernel smoothed empirical density functions of the posteriors are presented in Figure 7 for both Row and Column players. For the Row player, the lower and middle posteriors are closer together than our assumed sets. For the Column player, the posteriors of the lower two state dependent strategies are shifted to the right of our assumed ones. We conjecture these shifts could come from erroneously assumed homogeneity of the strictly mixed strategy used by subjects. An alternative would be to increase the number of elements in S or to model the individuals' strictly mixed strategies as coming from a hierarchical process.
[Figure: two panels (Row, Column); vertical axis: Density; horizontal axis: Probability to Choose L, 0.0 to 1.0.]
Figure 7: Posterior distribution of B in Gamble-Safe game
4.3 Forecasting realized actions
Until now our primary concern has been the estimation of when subjects adopt pure and mixed strategies, and our HMM's function has been to provide a statistical framework to test theories about latent strategy choice. Now we explore the potential of the HMM to predict actions taken, a valuable capability in widespread applications from strategic maneuvers in military engagements to knowing when a poker player is bluffing.
We first consider how well the estimated HMMs coincide with the observed proportions of Left play in our experimental data set. For this forecasting exercise of the experimental panel data set we calculate, for each game and role, the predicted proportion of Left play by the M subjects in period t, $\widehat{\mathrm{Left}}_t$,

$$\widehat{\mathrm{Left}}_t = \frac{1}{M \cdot L} \sum_{l=1}^{L} \sum_{d=1}^{M} \sum_{j=1}^{N} I\left(s^{l}_{d,t} = j\right) \beta_j.$$

Here L is the length of the sequence of the Gibbs sampler we use for statistical inference, and the sum over j runs over the N latent states. For our data sets this sequence is iterations 5001 to 10000. Figure 8 presents plots of the time series of the predicted and actual proportions of Left play. In all four settings the predictions track the trends in the actual data. Admittedly this is an in-sample forecasting exercise, but the fit is nonetheless impressive, as minimizing forecast error is not the objective of our statistical inference exercise.
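The predicted proportion of Left play is an average, over retained Gibbs draws and subjects, of the Left probability attached to each sampled state. A minimal sketch for a single period follows; the function name, the toy state draws, and the state ordering (0 = PR, 1 = EM, 2 = PL with Left probabilities 0, 0.5, 1) are assumptions made for illustration only.

```python
def predicted_left_proportion(state_draws, beta):
    """Average Left probability over Gibbs draws and subjects for one period.

    state_draws[l][d] is the sampled state index of subject d in draw l;
    beta[j] is the probability of playing Left in state j.
    """
    L = len(state_draws)       # number of retained Gibbs draws
    M = len(state_draws[0])    # number of subjects
    # Summing I(s = j) * beta[j] over j simply picks out beta[s].
    total = sum(beta[s] for draw in state_draws for s in draw)
    return total / (M * L)


# Toy input: 2 retained draws, 3 subjects.
beta = [0.0, 0.5, 1.0]
draws = [[1, 1, 2], [0, 1, 2]]
left_hat = predicted_left_proportion(draws, beta)
print(left_hat)
```

In the paper's setting the outer list would hold the 5000 retained draws and the inner list the subjects in one role, with the calculation repeated for each period t.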
Out-of-sample forecasting is of more practical use, and we can use the HMM for this purpose as well. We estimate, with 10,000 iterations of the Gibbs sampler, the HMM for both point and uniform Beta priors on B for the first 100 repetitions and use these estimates to make one-step-ahead forecasts of the last 100 repetitions. Let $\Psi = \left(P^{[j]}, \pi^{[j]}, B^{[j]}, s^{[j]}_{i,t}\right)_{j=5001}^{10000}$ denote the realized draws of the Gibbs sampler for the last 5000 iterations of the MCMC algorithm that are used for statistical inference for the uniform Beta prior HMM. The predictive density of $s_{i,t}$ is obtained by simulation from the joint posterior sample Ψ as follows:

$$\hat{s}^{[j]}_{i,t} \sim p\left(s_{i,t} \mid P^{[j]}, \tilde{s}^{[j]}_{i,t-1}\right), \quad j = 5001, \ldots, 10000. \tag{1}$$

We can use these sampled states for subject i to generate 5000 draws from the following marginal posterior sample:

$$\hat{y}^{[j]}_{i,t} \sim p\left(y_{i,t} \mid \hat{s}^{[j]}_{i,t}, B^{[j]}\right), \quad j = 5001, \ldots, 10000. \tag{2}$$
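The logic of a one-step-ahead forecast followed by a Bayes' Rule update can be illustrated with an exact filter for a single parameter set. This is a simplified sketch and not the simulation scheme of Equations 1 and 2: it replaces sampling over 5000 draws with exact probabilities for one fixed transition matrix. The function names are hypothetical, and the numbers are the Row player posterior means from Table 5 with the point prior B = [0, 0.5, 1], assuming the state order PR, EM, PL.

```python
def predict_state(P, filtered_prev):
    """Distribution over today's state given yesterday's filtered distribution.

    P[j][i] is Pr(s_t = i | s_{t-1} = j).
    """
    n = len(P)
    return [sum(filtered_prev[j] * P[j][i] for j in range(n)) for i in range(n)]


def prob_left(state_dist, beta):
    """Forecast probability of Left: state probabilities times Left rates."""
    return sum(p * b for p, b in zip(state_dist, beta))


def bayes_update(state_dist, beta, played_left):
    """Posterior over today's state after observing the realized action."""
    like = [b if played_left else 1.0 - b for b in beta]
    joint = [p * l for p, l in zip(state_dist, like)]
    total = sum(joint)  # positive here because every state has prior mass
    return [x / total for x in joint]


P_row = [
    [0.815, 0.010, 0.175],
    [0.003, 0.988, 0.009],
    [0.260, 0.043, 0.697],
]
beta = [0.0, 0.5, 1.0]
pi0 = [0.169, 0.779, 0.052]

prior = predict_state(P_row, pi0)          # state forecast for the next period
forecast = prob_left(prior, beta)          # forecast probability of Left
post = bayes_update(prior, beta, True)     # update after observing Left
print(forecast, post)
```

Observing Left rules out the PR state in this example because its Left probability is zero, so all posterior mass moves to EM and PL before the next forecast is made.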
The average of the 5000 draws made according to Equation 2, denoted $\hat{y}_{i,t}$, is the prediction of $y_{i,t}$. Next we use $y_{i,t}$ to generate the posterior density $\tilde{s}_{i,t}$ by Bayes' Rule. This is substituted into Equation 1 to start the process of generating the prediction of $y_{i,t+1}$. To assess the accuracy of our forecast of the holdout sample, we calculate and report the log-likelihood statistic

$$LL(y \mid \Psi) = \sum_{t=101}^{200} \sum_{i=1}^{M} \ln\left[I_{y_{i,t}}\, p(\hat{y}_{i,t}) + (1 - I_{y_{i,t}})(1 - p(\hat{y}_{i,t}))\right].$$

We also report the Akaike information criterion statistic, which is $AIC(y, \hat{y}) = -2 \cdot (LL - \text{number of model parameters})$.

[Figure: four panels (PE Human-Row, PE Human-Column, GS Human-Row, GS Human-Column); vertical axis: Proportion of Left Play, 0.0 to 1.0; horizontal axis: Periods, 0 to 200; series: Estimated Proportion, Empirical Proportion.]
Figure 8: Actual and forecasted proportions of Left play
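The log-likelihood and AIC statistics defined above are straightforward to compute from a list of realized actions and forecast probabilities. The sketch below uses hypothetical function names; the parameter count of 27 for the IM model is an inference from the 27 subjects per role with one estimated mixing proportion each, not a figure stated in the text.

```python
import math


def holdout_loglik(actions, probs):
    """Sum of log forecast probabilities assigned to the realized actions.

    actions[k] is 1 for Left and 0 for Right; probs[k] is the forecast
    probability of Left for the same observation.
    """
    return sum(math.log(p if a == 1 else 1.0 - p) for a, p in zip(actions, probs))


def aic(loglik, n_params):
    """AIC as defined in the text: -2 * (LL - number of model parameters)."""
    return -2 * (loglik - n_params)


# Two observations forecast at 0.5 each contribute ln(0.5) apiece.
ll_toy = holdout_loglik([1, 0], [0.5, 0.5])

# Row player, P-E game, IM model in Table 6: Loglik of -1736 with 27 parameters.
aic_im = aic(-1736, 27)
print(ll_toy, aic_im)
```

Under the assumed parameter count, the IM value reproduces the AIC of 3526 reported for Row players in the P-E game.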
In order to evaluate the ability of alternative models to predict future actions in games, we compare the performance of one-step-ahead forecasting of the HMMs with point and uniform Beta priors (UHMM) on B against the alternatives of the Nash equilibrium strategy and individual-specific mixed strategies (IM), which are estimated by each subject's proportion of Left play in the first 100 repetitions.
We summarize the out-of-sample forecasting performance for each of the four models in Table 6. First, for the Row players in both game treatments the two HMMs outperform the two other models both when we do and when we do not penalize for the number of parameters. For the holdout sample of the Column players and not penalizing for the number of parameters, the IM model performs comparably to the two HMM models in the Pursue-Evade game, and the IM model performs comparably to the UHMM model (both of which outperform the HMM) in the Gamble-Safe game. However, when we penalize for increasing numbers of parameters we see the UHMM clearly outperforms the IM model. This suggests that our homogeneous dynamic model performs well on forecasting a population of game players, but also suggests that allowing for more individual heterogeneity could lead to even better out-of-sample forecasting performance.
Table 6: Out of sample forecasting performance

                          P-E Game            G-S Game
                 Model    Loglik    AIC       Loglik    AIC
Row Player       NE       −1748     3497      −1941     3884
                 IM       −1736     3526      −1916     3887
                 HMM      −1723     3475      −1887     37
                 UHMM     −1710     3458      −1865     3751
Column Player    NE       −2058     4117      −1941     3885
                 IM       −1775     3605      −1429     2915
                 HMM      −1776     3583      −1517     3050
                 UHMM     −1771     3580      −1436     24
5 Discussion
We have introduced an HMM for the detection of pure and mixed strategy play in repeated games. We then applied this model to data from a new experiment in which human subjects repeatedly play against computer opponents that were programmed to play their part of the mixed strategy Nash equilibrium. We find that subjects do play both pure and mixed strategies, and switch between these over the course of play. Further, we find there is non-stationarity in the distribution of latent strategies over time. We observe a large movement from the initial distribution over strategies to the limiting distribution of the HMM. However, while the limiting distribution assigns probability to the subjects' NE strategy, the assigned probability is less than 1. Thus, for our data, we show that a mixed strategy Nash equilibrium is only partially self-enforcing. This is a new result in behavioral game theory, as previous studies have only considered the composite hypothesis that mixed strategy equilibria are both self-enforcing and also the limit point of the subjects' learning process.
Our primary interest has been modeling a population of players interacting in a game with known payoffs; however, there are several natural extensions to our approach. First, we could focus on the modeling and forecasting of a single subject from the population. To do this, we likely need to allow more individual heterogeneity in the HMM. A first step would be to allow each player to have a set of individual-specific strict mixed strategies to follow. This could be done by allowing individual state dependent mixed strategies $B_{is}$, or by modeling these $B_{is}$ as coming from a hierarchical structure characterized by a small set of hyperparameters. A second extension is to model strategic situations in the field, in which the game payoffs are not known because of unobserved individual heterogeneity. For example, soccer players making penalty kicks may vary in their strength of kicking left or right, and similarly goalies may have unobservable differences in defending kicks to the left and right. In such cases, the HMM can help identify such payoffs and also describe the players' learning process regarding these latent payoff types.
The HMM as presented in this paper is currently more of a statistical description than a behavioral model derived from optimizing behavior. To become such a behavioral model, the transition probabilities must become an endogenous function of a player's expected payoffs for the differing latent strategy choices. One possible approach is to allow a player to form beliefs about an opponent's action and then best respond. The issue here is that in an expected utility world a mixed strategy is never a strict best response. However, if one takes the approach that uncertainty about an opponent's action is ambiguous (i.e., a player doesn't have the ability to form a unique prior), then an ambiguity-averse player may strictly prefer a mixed strategy over pure strategies.
References
Aumann, Robert J. (1985), "On the non-transferable utility value: A comment on the Roth-Shafer examples." Econometrica, 53, 667–678.
Bar-Eli, M., O. Azar, I. Ritov, Y. Keidar-Levin, and G. Schein (2007), "Action bias among elite soccer goalkeepers: The case of penalty kicks." Journal of Economic Psychology, 28, 606–621.
Binmore, Ken, Joe Swierzbinski, and Chris Proulx (2001), "Does minimax work? An experimental study." Economic Journal, 111, 445–.
Bloomfield, Robert (1994), "Learning a mixed strategy equilibrium in the laboratory." Journal of Economic Behavior & Organization, 25, 411–436.
Chiappori, Pierre, Steven Levitt, and T. Groseclose (2002), "Testing mixed-strategy equilibria when players are heterogeneous: The case of penalty kicks in soccer." American Economic Review, 92, 1138–1151.
Geman, Stuart and Donald Geman (1987), "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images." In Readings in computer vision: issues, problems, principles, and paradigms (Martin A. Fischler and Oscar Firschein, eds.), 5–584, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
Geweke, John (1991), "Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments." Staff Report 148, Federal Reserve Bank of Minneapolis.
Greiner, Ben (2004), "An online recruitment system for economic experiments." In Forschung und wissenschaftliches Rechnen (Kurt Kremer and Volker Macho, eds.), volume 63 of Ges. für Wiss. Datenverarbeitung, 79–93, GWDG Bericht.
Morgan, John and Martin Sefton (2002), "An experimental investigation of unprofitable games." Games and Economic Behavior, 40, 123–146.
Nash, John (1951), "Non-Cooperative Games." The Annals of Mathematics, , 286–295.
Noussair, Charles and Marc Willinger (2011), "Mixed strategies in an unprofitable game: an experiment." Working papers, LAMETA, University of Montpellier.
Nyarko, Yaw and Andrew Schotter (2002), "An experimental study of belief learning using elicited beliefs." Econometrica, 70, 971–1005.
Ochs, Jack (1995), "Games with unique, mixed strategy equilibria: An experimental study." Games and Economic Behavior, 10, 202–217.
O'Neill, Barry (1987), "Nonmetric test of the minimax theory of two-person zero-sum games." Proceedings of the National Academy of Sciences, U.S.A., 84, 2106–2109.
Palacios-Huerta, Ignacio (2003), "Professionals play minimax." Review of Economic Studies, 70, 395–415.
Rabiner, Lawrence R. (19), "A tutorial on hidden Markov models and selected applications in speech recognition." Proceedings of the IEEE, 77, 257–286.
Rosenthal, Robert W., Jason Shachat, and Mark Walker (2003), "Hide and seek in Arizona." International Journal of Game Theory, 32, 273–293.
Selten, Reinhard and Thorsten Chmura (2008), "Stationary concepts for experimental 2x2-games." American Economic Review, 98, 938–966.
Shachat, Jason (2002), "Mixed strategy play and the minimax hypothesis." Journal of Economic Theory, 104, 1–226.
Shachat, Jason and J. Todd Swarthout (2004), "Do we detect and exploit mixed strategy play by opponents?" Mathematical Methods of Operations Research, 59, 359–373.
Von Neumann, John (1928), "Zur Theorie der Gesellschaftsspiele." Mathematische Annalen, 100, 295–320.
Von Neumann, John and Oskar Morgenstern (1944), Theory of Games and Economic Behavior. Princeton University Press.
Walker, Mark and John Wooders (2001), "Minimax play at Wimbledon." American Economic Review, 91, 1521–1538.