
A hidden Markov model for the detection of pure and mixed strategy play in games



Jason Shachat†   J. Todd Swarthout‡   Lijia Wei§

July 7, 2012

Abstract

We propose a statistical model to assess whether individuals strategically use mixed strategies in repeated games. We formulate a hidden Markov model in which the latent state space contains both pure and mixed strategies, and allows switching between these states. We apply the model to data from an experiment in which human subjects repeatedly play a normal form game against a computer that always follows its part of the unique mixed strategy Nash equilibrium profile. Estimated results show significant mixed strategy play and non-stationary dynamics. We also explore the ability of the model to forecast action choice.

JEL classification: C92; C72; C10

Keywords: Mixed Strategy; Nash Equilibrium; Experiment; Hidden Markov Model

∗ This paper supersedes the previous working paper, “Man versus Nash: An experiment on the self enforcing nature of mixed strategy equilibrium.”

† Wang Yanan Institute for Studies in Economics and MOE Key Laboratory in Econometrics, Xiamen University. jason.shachat@gmail.com

‡ Department of Economics and Experimental Economics Center, Georgia State University. swarthout@gsu.edu

§ Wang Yanan Institute for Studies in Economics and MOE Key Laboratory in Econometrics, Xiamen University. ljwie.wise@gmail.com

1 Introduction

Game theory and the Nash equilibrium solution concept are a key framework in the social sciences for modeling interactive behavior. The formulation of a normal form game consists of a set of players, a set of possible actions for each player, and a payoff function for each player that gives a real-valued payoff for any possible joint action profile, that is, a list of actions consisting of one for each player. A Nash equilibrium is a joint action profile such that each player's assigned action results in at least as high a payoff to the player as any other possible action, assuming all other players choose their respective actions in the Nash equilibrium profile. If players are restricted to deterministically choose an action, then there are many games that don't have a Nash equilibrium, such as the childhood game of Rock, Scissors, Paper. Confronted with this problem, Von Neumann (1928) generalized a player's decision from choosing an action to choosing a probability distribution over his possible actions.1 This choice of a probability distribution is called a “mixed” strategy, and a degenerate mixed strategy which chooses a particular action with probability one is called a “pure” strategy. The introduction of mixed strategies allows for existence of equilibrium across a broad class of games: from minimax solutions for zero-sum games (Von Neumann, 1928; Von Neumann and Morgenstern, 1944) to noncooperative equilibria for n-person games (Nash, 1951). While the role of mixed strategies in defining logically consistent solution concepts is not in doubt, the positive aspect of individuals actually playing mixed strategies is an open question of considerable interest.

Researchers' efforts to answer this question have naturally focused on settings where the use of mixed strategies is most compelling: the repeated play of games which have a unique mixed strategy Nash equilibrium. The value of “being unpredictable” is readily seen in examples such as serves in tennis, “bluffing” in poker, and whether or not a tax authority audits a taxpayer. A common approach in this literature is to test whether the players' action choices are consistent with the mixed strategy equilibrium. Some studies using controlled experiments with human subjects have the advantage of knowing the payoff functions, and test whether choice frequencies agree with the equilibrium strategies and whether players' sequences of actions are serially independent (O'Neill, 1987; Binmore, Swierzbinski, and Proulx, 2001; Morgan and Sefton, 2002; Selten and Chmura, 2008). Other studies consider high-level sports competitions, such as soccer (Chiappori, Levitt, and Groseclose, 2002; Palacios-Huerta, 2003; Bareli, Azar, Ritov, Keidarlevin, and Schein, 2007) and tennis (Walker and Wooders, 2001), with the advantage of studying highly experienced players competing for high stakes and the disadvantage of unknown payoff functions.2 These studies focus on testing the serial independence of action choice and the equilibrium implication of equal payoffs across action choices. Some of the most prominent and recurring results for both types of studies are that aggregated action frequencies across players agree with the equilibrium mixed strategies but individual action frequencies do not, and for many individuals action choices are serially correlated, violating the independence prediction.

1 Along with generalizing the set of feasible actions to the set of mixed strategies, a player's payoff function is extended by setting its value to the expected payoff given a profile of mixed strategies, commonly referred to as the expected utility property.

To reconcile these issues of serial correlation and heterogeneity, several studies (Ochs, 1995; Bloomfield, 1994; Shachat, 2002; Noussair and Willinger, 2011) conduct laboratory experiments using the same type of games but directly elicit mixed strategies by obligating players to select a probability distribution over actions.3 Elicited strategies in these experiments exhibit various distinct patterns. Some subjects choose pure strategies almost exclusively, some choose strictly mixed strategies almost exclusively, and others use both types of strategies, usually in long sequences. Also, certain mixed strategies are often quite focal, such as choosing equal probability weight on a subset of actions rather than the Nash equilibrium proportions. Naive interpretation of these results suggests a clear distinction between play that is purposely unpredictable and play that is a pure best response to changing forecasts of an opponent's action (Nyarko and Schotter, 2002). A more cautious interpretation is that subjects may eschew the randomizing device provided by the experimenter and instead internally randomize, or perhaps subjects choose strictly mixed strategies due to the experimenter effect of the novel elicitation method. Clearly a less invasive method to detect mixed strategy play would be valuable.

In this study we propose a hidden Markov model (HMM) to detect whether observed action choices are the result of pure or mixed strategy play in repeated two-person finite action games.4 There are three key ideas in our formulation: (1) we treat the strategy a player follows as a latent state and the action played as the observable output from the latent strategy; (2) the set of possible latent states is a discrete subset of all possible mixed strategies containing pure strategies, Nash equilibrium or minimax strategies, and focal mixed strategies; and (3) a player switches the latent strategy he follows according to a first-order Markov process. We then demonstrate the ability of the model by applying it to a new experimental data set we collect. In our experiment, each human subject repeatedly plays a 2×2 game against a computer player that follows its mixed strategy equilibrium. Some subjects play a zero-sum game and others an unprofitable game.5 The estimated HMMs reveal several interesting results, including: (1) significant amounts of both pure and mixed strategy play; (2) the focal equiprobable mixed strategy is played more often than the Nash equilibrium strategy; (3) low transition probabilities between mixed and pure latent strategies; (4) dynamic adjustments in the types of strategies players follow over time; and (5) appreciable rates of both mixed and pure strategy play in the limiting distributions of the HMMs (interpreted as the long run equilibrium of play). We then extend the HMM from a statistical framework for evaluating hypotheses to one for forecasting action choice and assess its predictive accuracy.

2 The action sets are typically comprised of simple actions, e.g., {serve left, serve right} and {defend left, defend right}. The payoffs are assumed to be the probability of winning the task and these probabilities will differ based upon both the comparative skills between players and the relative strengths a player has for each action.

3 For example, Shachat (2002) adopts a game with four actions, each identified by a different color, for each player. Each player must fill a box with 100 cards in any combination of the four colored card types, and then one card is selected at random to determine the action played.

4 See Rabiner (19) for a classic introduction to hidden Markov models.

2 A HMM of switching strategies

Consider an experiment in which we observe M pairs of subjects, each playing T periods of the same 2×2 normal form game. Often games like this are described by a two-by-two table, and for familiarity purposes we denote one subject's player role as Row and the other as Column. We label each player role's two possible actions Left (L) and Right (R), and express a subject's mixed strategy as the probability of playing L. Of particular interest is when the game has a single Nash equilibrium and it is in strictly mixed strategies, although our framework is not restricted to studying only such cases. Three factors confounding the analysis of data generated by this type of process are the latency of players' mixed strategies, the heterogeneity of strategy adoption across subjects, and variation of adopted latent strategies over the course of repeated play. In this section, we present a model that accommodates and allows estimation of these confounds.

Consider the following HMM for a fixed player role. The state space S is an n-element subset of subject i's possible mixed strategies. Denote by $s_{i,t} \in S$ the strategy used by subject i in period t; $S_i$ is the set of all possible T-element sequences of mixed strategies for i, with typical element $s_i$; and let s be the collection of $s_i$ for all M subjects in a given player role. Let $y_{i,t}$ denote subject i's realized action in period t; $y_i$ is the corresponding T-element sequence of i's observable actions, and y is the collection of $y_i$ for all M subjects. View {y, s} as the output of the HMM.

The probability structure of the HMM has three elements. The first is the n-element vector B, whose element $B_j$ is the probability a subject chooses action Left, i.e. the mixed strategy, if he is in state j. We will provide two analyses which differ in how we specify B. In one approach we consider B as known a priori, and S and B are redundant notation. Usually, in this approach, B contains the two pure strategies, other strategies suggested by theory such as Nash equilibrium or minimax, and other focal strategies. In the second approach we treat the elements of B as unknown parameters: the state dependent mixed strategies. The second element of the structure, π, is the initial multinomial probability distribution over S. The third element, P, is the n×n transition probability matrix. The element $P_{jk}$ is the probability a subject adopts strategy k in period t conditional upon having adopted strategy j in period t−1.

5 An unprofitable game is one in which the minimax and Nash equilibrium solutions are distinct but yield the same expected payoff for each player.
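As an illustration of the three elements (B, π, P), the following minimal Python sketch simulates one subject's latent strategy path and observed actions. The function name `simulate_subject` and all numerical values of π and P are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import random

def simulate_subject(B, pi, P, T, rng):
    """Simulate one subject's latent strategies and actions for T periods."""
    n = len(B)
    states, actions = [], []
    state = None
    for t in range(T):
        if t == 0:
            # The initial latent strategy is drawn from pi.
            state = rng.choices(range(n), weights=pi)[0]
        else:
            # Later strategies follow row `state` of the transition matrix P.
            state = rng.choices(range(n), weights=P[state])[0]
        states.append(state)
        # The action is Left with probability B[state], otherwise Right.
        actions.append("L" if rng.random() < B[state] else "R")
    return states, actions

# Hypothetical parameter values (assumptions for this sketch only).
B = [0.0, 0.5, 2 / 3, 1.0]   # pure Right, equiprobable, 2/3 Left, pure Left
pi = [0.1, 0.6, 0.2, 0.1]
P = [
    [0.85, 0.05, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.05, 0.05, 0.05, 0.85],
]

rng = random.Random(0)
states, actions = simulate_subject(B, pi, P, T=200, rng=rng)
```

Only `actions` would be observed by the analyst; `states` plays the role of the latent sequence $s_i$.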

The likelihood function of (B, π, P) is

$$L(B, \pi, P \mid y, s) = \Pr(y, s \mid B, \pi, P).$$

Rewriting this likelihood in terms of the marginal distributions of y and s gives us

$$L(B, \pi, P \mid y, s) = \Pr(y \mid s, B, \pi, P) \cdot \Pr(s \mid B, \pi, P).$$

Next, we assume that the marginal distribution of y conditional on s is independent of π and P. In other words, once the state is realized then the probability of a Left action relies solely on the mixed strategy of the current state. Also, by the specification of the HMM, s is independent of the state dependent mixed strategies B. This allows us to restate the previous likelihood function as

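$$L(B, \pi, P \mid y, s) = \Pr(y \mid s, B) \cdot \Pr(s \mid \pi, P).$$

Since the sequence of states for each subject is unobservable, we evaluate the likelihood by integrating over the set of all possible sequences

$$L(B, \pi, P \mid y, s) = \prod_{i=1}^{M} \sum_{s_i \in S_i} \pi(s_{i,1}) \, B_{s_{i,1}}^{I(y_{i,1})} \left(1 - B_{s_{i,1}}\right)^{1 - I(y_{i,1})} \prod_{t=2}^{T} P_{s_{i,t-1}, s_{i,t}} \, B_{s_{i,t}}^{I(y_{i,t})} \left(1 - B_{s_{i,t}}\right)^{1 - I(y_{i,t})},$$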

where $I(y_{i,t})$ is an indicator function which equals one for the action Left and zero for the action Right. As T grows, the number of calculations needed to evaluate this likelihood quickly becomes computationally impractical. We describe the Bayesian approach we take to estimate the HMM, although one could proceed down a frequentist path of maximizing the expected likelihood function using some variation of the EM (expectation-maximization) algorithm.
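To make the computational point concrete, the sketch below evaluates one subject's likelihood in two ways: by brute-force enumeration of all $n^T$ state sequences, and by the standard forward recursion for HMMs, which needs on the order of $T n^2$ operations. The forward recursion is a textbook technique shown here only for comparison; it is not the estimation method used in this paper. Function names and parameter values are hypothetical.

```python
import math
from itertools import product

def emission(B_k, y_t):
    """Probability of the observed action (1 = Left, 0 = Right) in a state with Pr(Left) = B_k."""
    return B_k if y_t == 1 else 1.0 - B_k

def likelihood_brute_force(y, B, pi, P):
    """Sum the joint probability of y and s over every possible state sequence s."""
    n, T = len(B), len(y)
    total = 0.0
    for s in product(range(n), repeat=T):
        p = pi[s[0]] * emission(B[s[0]], y[0])
        for t in range(1, T):
            p *= P[s[t - 1]][s[t]] * emission(B[s[t]], y[t])
        total += p
    return total

def log_likelihood_forward(y, B, pi, P):
    """Scaled forward recursion; returns the log of the same likelihood."""
    n, T = len(B), len(y)
    alpha = [pi[k] * emission(B[k], y[0]) for k in range(n)]
    log_lik = 0.0
    for t in range(T):
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the observed sequence is impossible under these parameters
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
        if t + 1 < T:
            # Propagate one period ahead and weight by the next emission probability.
            alpha = [
                emission(B[k], y[t + 1]) * sum(alpha[j] * P[j][k] for j in range(n))
                for k in range(n)
            ]
    return log_lik

# Hypothetical values for illustration only.
B = [0.0, 0.5, 2 / 3, 1.0]
pi = [0.1, 0.6, 0.2, 0.1]
P = [
    [0.85, 0.05, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.05, 0.05, 0.05, 0.85],
]
y = [1, 0, 1, 1, 0, 1]  # a short action sequence: 1 = Left, 0 = Right

brute = likelihood_brute_force(y, B, pi, P)
forward = math.exp(log_likelihood_forward(y, B, pi, P))
```

With four states, the enumeration already visits 4^6 = 4096 sequences for six periods; for the 200 periods of the experiment it would be infeasible, which is the motivation for the sampling approach described next.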

In the Bayesian analysis, we first factor the joint posterior distribution of the unknown HMM parameters and unobserved states s into the product of marginal conditional posterior distributions. Then we evaluate these marginal conditional posteriors through an iterative sampling procedure called the Markov chain Monte Carlo (MCMC) method. MCMC is a simple but powerful procedure in which the empirical distributions of the sampled parameters converge to the true posterior distributions. After convergence, iterative sampling is continued to construct empirical density functions. These are used to make inferences regarding the parameters of the hidden Markov models.

Consider the posterior density function on the realized unobserved states and HMM parameters h(s, B, P, π | y). First, express this joint density as the product of the marginal density of HMM parameters conditional on the observed action choices and unobserved states with the marginal density of the states conditional upon action choices

$$h(s, B, P, \pi \mid y) = h(B, P, \pi \mid s, y) \, h(s \mid y).$$

We have already assumed that the transition matrix P and initial probabilities over states π are independent of the action choices and state contingent mixed strategies B, which allows us to state

$$h(s, B, P, \pi \mid y) = h(B \mid s, y) \, h(P, \pi \mid s, y) \, h(s \mid y).$$

This product of three conditional posteriors permits a simple Markov chain procedure of sequentially sampling from these distributions. We start with some initial arbitrary values for the HMM parameters, $(B^{(l)}, P^{(l)}, \pi^{(l)})$ where l = 0. We create $s^{(0)}$ by simulation using $P^{(0)}$ and $\pi^{(0)}$ without conditioning on y. From these initial parameter values and the observed action sequences y, we use a Gibbs sampling algorithm to generate an initial sample of state sequences $s^{(1)}$. Then we make a random draw $P^{(1)}$ from the posterior distribution of P conditional on $s^{(1)}$ and y, and proceed similarly to make a random draw of $\pi^{(1)}$. We complete the iteration by making a random draw $B^{(1)}$ from the posterior of B conditional on $s^{(1)}$ and y. The key to the MCMC method is that as l → ∞, the joint and marginal distributions of $B^{(l)}$, $P^{(l)}$, and $\pi^{(l)}$ converge weakly to the joint and marginal posterior distributions of these parameters (Geman and Geman, 1987). We now describe the details of each step in an iteration of the MCMC procedure.

Step 1: Sampling the state sequences $s^{(l)}$

We begin by describing a Gibbs sampling technique for generating draws from the distribution of $s^{(l)}$ conditional upon y and $(B^{(l-1)}, P^{(l-1)}, \pi^{(l-1)})$. The elements of $s_i$ can be drawn sequentially for each t conditioning on the observed action choice $y_{i,t}$, the realized state in other periods, π, and P. Let $s_{i,\neq t}$ be the vector obtained by removing $s_{i,t}$ from the sequence $s_i$. Given $s_{i,\neq t}$, we express the conditional posterior distribution of $s_{i,t}$ as

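$$\Pr\left(s_{i,t} \mid y_{i,t}, B^{(l-1)}, P^{(l-1)}, s^{(l)}_{i,\neq t}, s^{(l-1)}_{i,\neq t}\right) \propto \Pr\left(y_{i,t} \mid s_{i,t}, B^{(l-1)}\right) \cdot \Pr\left(s_{i,t} \mid P^{(l-1)}, s^{(l)}_{i,\neq t}, s^{(l-1)}_{i,\neq t}\right)$$

with

$$\Pr\left(s_{i,t} \mid P^{(l-1)}, s^{(l)}_{i,\neq t}, s^{(l-1)}_{i,\neq t}\right) = \Pr\left(s_{i,t} = k \mid P^{(l-1)}, s^{(l)}_{i,t-1}, s^{(l-1)}_{i,t+1}\right).$$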

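Consequently, the conditional posterior probability of $s_{i,t} = k$ and t > 1 is

$$\Pr\left(s^{(l)}_{i,t} = k \mid \cdot\right) = \frac{\Pr\left(y_{i,t} \mid s_{i,t} = k, B^{(l-1)}_k\right) \cdot \Pr\left(s_{i,t} = k \mid P^{(l-1)}, s^{(l)}_{i,t-1}, s^{(l-1)}_{i,t+1}\right)}{\sum_{j=1}^{n} \Pr\left(y_{i,t} \mid s_{i,t} = j, B^{(l-1)}_j\right) \cdot \Pr\left(s_{i,t} = j \mid P^{(l-1)}, s^{(l)}_{i,t-1}, s^{(l-1)}_{i,t+1}\right)},$$

and for t = 1

$$\Pr\left(s^{(l)}_{i,1} = k \mid \cdot\right) = \frac{\Pr\left(y_{i,1} \mid s_{i,1} = k, B^{(l-1)}_k\right) \cdot \Pr\left(s_{i,1} = k \mid \pi^{(l-1)}, s^{(l-1)}_{i,2}\right)}{\sum_{j=1}^{n} \Pr\left(y_{i,1} \mid s_{i,1} = j, B^{(l-1)}_j\right) \cdot \Pr\left(s_{i,1} = j \mid \pi^{(l-1)}, s^{(l-1)}_{i,2}\right)}.$$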

The state $s^{(l)}_{i,t}$ is determined by making a random draw from the uniform distribution on the unit interval, and comparing this draw to the calculated conditional probability of $s^{(l)}_{i,t}$.
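The single-site update of Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, actions are coded 1 for Left and 0 for Right, and the transition factor is written out as the product of the probability of entering state k from the previous period and the probability of leaving k for the next period's state. Updating the list in place means that earlier periods hold current-iteration draws while later periods still hold the previous iteration's values.

```python
import random

def sample_state(t, y, states, B, pi, P, rng):
    """Draw the state for period t given the action y[t] and the neighbouring states."""
    n, T = len(B), len(y)
    weights = []
    for k in range(n):
        w = B[k] if y[t] == 1 else 1.0 - B[k]            # Pr(y_t | state k)
        w *= pi[k] if t == 0 else P[states[t - 1]][k]    # entering k
        if t < T - 1:
            w *= P[k][states[t + 1]]                     # leaving k for the next state
        weights.append(w)
    total = sum(weights)
    if total == 0.0:
        return states[t]  # assumption of this sketch: keep the current state if no state is feasible
    # Compare a uniform draw with the cumulative normalized probabilities.
    u = rng.random() * total
    cumulative, chosen = 0.0, states[t]
    for k, w in enumerate(weights):
        if w == 0.0:
            continue
        cumulative += w
        chosen = k
        if u < cumulative:
            break
    return chosen

def gibbs_sweep(y, states, B, pi, P, rng):
    """One pass over t = 1..T for a single subject, updating `states` in place."""
    for t in range(len(y)):
        states[t] = sample_state(t, y, states, B, pi, P, rng)
    return states

# Hypothetical values for illustration only.
B = [0.0, 0.5, 2 / 3, 1.0]
pi = [0.25, 0.25, 0.25, 0.25]
P = [
    [0.85, 0.05, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.05, 0.05, 0.05, 0.85],
]
y = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
states = [1] * len(y)  # start every period in the equiprobable state
rng = random.Random(1)
for _ in range(50):
    gibbs_sweep(y, states, B, pi, P, rng)
```

With a degenerate B such as the one above, a Left action can never be assigned to the pure Right state and a Right action can never be assigned to the pure Left state, because the corresponding weights are zero.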

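Step 2: Sampling the transition matrix $P^{(l)}$ and $\pi^{(l)}$

The posterior distributions of $P_{jk}$ and π depend only upon $s^{(l)}$ and the priors. We specify the prior of π as a Dirichlet distribution $h(\pi; \alpha_1, \ldots, \alpha_n)$ where $\alpha_j = 1$, for 1 ≤ j ≤ n. Similarly, we specify the prior of the jth row of P as a Dirichlet distribution $h(p_{j1}, \ldots, p_{jn} \mid \eta_{j1}, \ldots, \eta_{jn})$. In an experiment, we record the data from the true start of the HMM process, so we assume that the joint posterior is simply the product of these two marginal posteriors. The respective posteriors of $\pi^{(l)}$ and $P^{(l)}$ are

$$h(\pi \mid s) \propto \Pr(s \mid \pi) \, h(\pi; \alpha_1, \ldots, \alpha_n),$$

and

$$h(P_{j1}, \ldots, P_{jn} \mid s) \propto \Pr(s \mid P_{j1}, \ldots, P_{jn}) \, h(P_{j1}, \ldots, P_{jn}; \eta_{j1}, \ldots, \eta_{jn}).$$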

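If $\nu_{0j}$ is the number of incidences of $s_{i,1} = j$ in $s^{(l)}$, and $\nu_{jk}$ is the count of transitions from state j to k in $s^{(l)}$, then the conditional probabilities in the two posterior calculations are multinomial distributions

$$h\left(\pi \mid s^{(l)}\right) \propto \pi_1^{\nu_{01}} \cdots \pi_{n-1}^{\nu_{0,n-1}} \cdot \left(1 - \sum_{k=1}^{n-1} \pi_k\right)^{\nu_{0n}} h(\pi; \alpha_1, \ldots, \alpha_n)$$

and

$$h(P_{j1}, \ldots, P_{jn} \mid s) \propto P_{j1}^{\nu_{j1}} \cdots P_{j,n-1}^{\nu_{j,n-1}} \cdot \left(1 - \sum_{k=1}^{n-1} P_{jk}\right)^{\nu_{jn}} h(P_{j1}, \ldots, P_{jn}; \eta_{j1}, \ldots, \eta_{jn}).$$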

Since the Dirichlet distribution is the conjugate prior for the multinomial distribution, these posterior distributions are also Dirichlet distributions for which each shape parameter is the sum of its prior value and the respective count

$$h(\pi \mid s) = h(\pi; \alpha_1 + \nu_{01}, \ldots, \alpha_n + \nu_{0n})$$

and

$$h(P_{j1}, \ldots, P_{jn} \mid s) = h(P_{j1}, \ldots, P_{jn}; \eta_{j1} + \nu_{j1}, \ldots, \eta_{jn} + \nu_{jn}).$$

Hence, we select $\pi^{(l)}$ and $P^{(l)}$ by taking random draws from these distributions.
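A minimal Python sketch of this conjugate update follows. It counts initial states and transitions in a set of sampled state sequences and then draws π and each row of P from the corresponding Dirichlet posteriors, using the standard construction of a Dirichlet draw as normalized independent Gamma draws. The function names are hypothetical, and setting every $\eta_{jk}$ to one is an assumption of this sketch; the text above specifies $\alpha_j = 1$ but not the values of $\eta_{jk}$.

```python
import random

def count_states(state_seqs, n):
    """Return (nu0, nu): counts of initial states and of transitions j -> k."""
    nu0 = [0] * n
    nu = [[0] * n for _ in range(n)]
    for seq in state_seqs:
        nu0[seq[0]] += 1
        for prev, curr in zip(seq, seq[1:]):
            nu[prev][curr] += 1
    return nu0, nu

def draw_dirichlet(shape, rng):
    """Draw from a Dirichlet distribution by normalizing independent Gamma(shape, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in shape]
    total = sum(gammas)
    return [g / total for g in gammas]

def draw_pi_and_P(state_seqs, n, alpha, eta, rng):
    """Draw pi and each row of P from their Dirichlet posteriors (prior value plus count)."""
    nu0, nu = count_states(state_seqs, n)
    pi = draw_dirichlet([alpha[j] + nu0[j] for j in range(n)], rng)
    P = [
        draw_dirichlet([eta[j][k] + nu[j][k] for k in range(n)], rng)
        for j in range(n)
    ]
    return pi, P

# Two short hypothetical state sequences with n = 3 states.
state_seqs = [[0, 0, 1, 1, 2], [1, 1, 1, 0, 0]]
n = 3
alpha = [1.0] * n                    # uniform Dirichlet prior on pi, as in the text
eta = [[1.0] * n for _ in range(n)]  # assumption: uniform prior on each row of P

rng = random.Random(2)
nu0, nu = count_states(state_seqs, n)
pi_draw, P_draw = draw_pi_and_P(state_seqs, n, alpha, eta, rng)
```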

Step 3: Sampling the state dependent mixed strategies B

For our initial approach to modeling the state dependent mixed strategies, we assume B corresponds to a known subset of S. In our Bayesian analysis this is equivalent to assuming a point prior on these strategies, and therefore there is no updating. So in our Gibbs sampling procedure we skip this step, and proceed to the next iteration of the Gibbs sampler. Of course this is a rather strong prior to assume, and we should evaluate whether it is appropriate. Accordingly, we conduct an auxiliary analysis in which we assume a uniform prior over the set of all mixed strategies.

In the auxiliary analysis we proceed as follows. The priors of state dependent mixed strategies $B_1, \ldots, B_n$ are assumed independent of each other and of the Markov process governing the states. Given these assumptions, we can think of each $B_j$ as a Bernoulli probability, and each Left (Right) action as a success (failure) when occurring in state j. The likelihood function is calculated as a binomial trial. Since it is the conjugate prior of the binomial, we assume the prior is a Beta distribution, denoted $\beta(B_j; \zeta_j, \gamma_j)$. We want a uniform prior as well, and that corresponds to setting the shape parameters $\zeta_j$ and $\gamma_j$ to one. The posterior distribution is simply

$$h\left(B_j \mid y, s^{(l)}\right) = \beta(B_j; \zeta_j + \kappa_{L,j}, \gamma_j + \kappa_{R,j}),$$

where $\kappa_{L,j}$ and $\kappa_{R,j}$ are the number of times the actions Left and Right, respectively, are chosen when in state j according to $s^{(l)}$. The state conditional mixed strategies $B^{(l)}_j$, j = 1, ..., n, are randomly drawn from these Beta posterior distributions, completing an iteration of the Gibbs sampler.
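The Beta update in the auxiliary analysis can be sketched as below. The function names and the toy data are hypothetical; actions are coded 1 for Left and 0 for Right, and the uniform prior corresponds to $\zeta_j = \gamma_j = 1$ as stated above.

```python
import random

def count_actions_by_state(state_seqs, action_seqs, n):
    """Return (kappa_L, kappa_R): Left and Right counts observed in each state."""
    kappa_L, kappa_R = [0] * n, [0] * n
    for states, actions in zip(state_seqs, action_seqs):
        for s, a in zip(states, actions):
            if a == 1:
                kappa_L[s] += 1
            else:
                kappa_R[s] += 1
    return kappa_L, kappa_R

def draw_B(state_seqs, action_seqs, n, zeta, gamma, rng):
    """Draw each B_j from its Beta posterior: prior shape plus the matching count."""
    kappa_L, kappa_R = count_actions_by_state(state_seqs, action_seqs, n)
    return [
        rng.betavariate(zeta[j] + kappa_L[j], gamma[j] + kappa_R[j])
        for j in range(n)
    ]

# Toy data: state 0 always plays Left, state 1 always plays Right.
state_seqs = [[0] * 1000 + [1] * 1000]
action_seqs = [[1] * 1000 + [0] * 1000]
n = 2
zeta = [1.0] * n    # uniform Beta(1, 1) prior
gamma = [1.0] * n

rng = random.Random(3)
kappa_L, kappa_R = count_actions_by_state(state_seqs, action_seqs, n)
B_draw = draw_B(state_seqs, action_seqs, n, zeta, gamma, rng)
```

In this toy case the draws concentrate near one for state 0 and near zero for state 1, since the posteriors are Beta(1001, 1) and Beta(1, 1001).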

The Gibbs sampler is run for a large number of iterations until the empirical distribution of all the parameters has converged (Geweke, 1991). Then the sampling procedure is allowed to continue to run for another number of iterations to build up an empirical distribution that corresponds to the posterior distribution of the HMM parameters. It is from this empirical distribution that we conduct statistical inferences.

3 The experiment

We apply our HMM framework to a new experimental data set that provides a likely setting for mixed strategies, and particularly Nash equilibrium strategies, to be adopted. Additionally, our procedures allow us to estimate for one player role without the need to also simultaneously model the opposing role, because each human subject repeatedly plays against a computer player that follows its mixed strategy equilibrium. Each subject is informed that his opponent is a computer but is given no information regarding the computer's strategy. We adopt two different games in our experimental design, with each subject playing only one of the two games. One game is zero-sum and the other game is unprofitable.

3.1 The games

Our first game is a zero-sum asymmetric matching pennies game introduced by Rosenthal, Shachat, and Walker (2003). The normal form representation of this game is presented on the left side of Figure 1. The game is called Pursue-Evade because the Row player “captures” points from the Column player when the actions of the two players match, and the Column player “evades” a loss when the players' actions differ. In the game each player can move either Left or Right, and the game has a unique Nash equilibrium in which each player chooses Left with probability two-thirds. In equilibrium, Row's expected payoff is two-thirds, and correspondingly Column's expected payoff is negative two-thirds.

Our second game is an unprofitable game introduced by Shachat and Swarthout (2004) called Gamble-Safe. Each player has a Gamble action (Left for each player) which yields a payoff of either two or zero, and a Safe action (Right for each player) which guarantees a payoff of one. The normal form representation of this game is presented on the right side of Figure 1. The Gamble-Safe game has a unique Nash equilibrium in which each player chooses the Left action with probability one-half, and each player earns an expected equilibrium payoff of one. Right is the minimax strategy for both players with a guaranteed payoff of one. Aumann (1985) argues that the Nash equilibrium prediction is not plausible in such an unprofitable game because its adoption assumes unnecessary risk to achieve the corresponding Nash equilibrium payoff. For example, imagine Row has Nash equilibrium beliefs and best responds by playing the Nash strategy. Row's expected payoff is one. However, suppose Column instead adopts his minimax strategy Right. This reduces Row's expected payoff to one-half. Row could avoid this risk by simply playing the minimax strategy. This aspect makes the Gamble-Safe game a more challenging test for the hypothesis of mixed strategy play than the zero-sum Pursue-Evade game.

                 Pursue-Evade Game                      Gamble-Safe Game
                   Column Player                          Column Player
                   Left      Right                        Left      Right
Row Player  Left   1, -1     0, 0      Row Player  Left   2, 0      0, 1
            Right  0, 0      2, -2                 Right  1, 2      1, 1

Figure 1: The experimental games
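The equilibrium claims for the two games can be checked from the payoffs in Figure 1 with the usual indifference conditions: each player's probability of Left must make the opponent indifferent between Left and Right. The short Python sketch below does this with exact fractions. The function names are hypothetical, and the formula assumes a 2×2 game whose unique equilibrium is in strictly mixed strategies, as is the case for both games here.

```python
from fractions import Fraction

def mixed_equilibrium_2x2(row_payoff, col_payoff):
    """Return (p, q): equilibrium probabilities of Left for Row and for Column.

    Payoff matrices are indexed [row action][column action] with 0 = Left, 1 = Right.
    """
    a, c = row_payoff, col_payoff
    # q makes Row indifferent between Left and Right.
    q = Fraction(a[1][1] - a[0][1], a[0][0] - a[0][1] - a[1][0] + a[1][1])
    # p makes Column indifferent between Left and Right.
    p = Fraction(c[1][1] - c[1][0], c[0][0] - c[1][0] - c[0][1] + c[1][1])
    return p, q

def expected_payoff(payoff, p, q):
    """Expected payoff when Row plays Left with probability p and Column with probability q."""
    return (p * q * payoff[0][0] + p * (1 - q) * payoff[0][1]
            + (1 - p) * q * payoff[1][0] + (1 - p) * (1 - q) * payoff[1][1])

# Payoffs from Figure 1.
pe_row, pe_col = [[1, 0], [0, 2]], [[-1, 0], [0, -2]]   # Pursue-Evade
gs_row, gs_col = [[2, 0], [1, 1]], [[0, 1], [2, 1]]     # Gamble-Safe

pe_p, pe_q = mixed_equilibrium_2x2(pe_row, pe_col)
gs_p, gs_q = mixed_equilibrium_2x2(gs_row, gs_col)

pe_row_value = expected_payoff(pe_row, pe_p, pe_q)
pe_col_value = expected_payoff(pe_col, pe_p, pe_q)
gs_row_value = expected_payoff(gs_row, gs_p, gs_q)
# Row plays the Nash strategy while Column deviates to the minimax strategy Right (q = 0).
gs_row_vs_minimax = expected_payoff(gs_row, gs_p, Fraction(0))
```

The last quantity reproduces the risk described in the text: against a Column player who always chooses Right, Row's Nash strategy earns one-half rather than one.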

3.2 Subject recruitment and experiment protocol

We conducted six experiment sessions in the Finance and Economics Experimental Laboratory (FEEL) at Xiamen University during December 2011. A total of 110 undergraduate and masters students participated in the experiment, with each session containing between 12 and 22 subjects. Of these, 56 subjects were assigned to the Gamble-Safe game treatment, and the remaining subjects were assigned to the Pursue-Evade game treatment. Subjects were evenly divided into Row and Column player roles within each treatment. FEEL uses the ORSEE online recruitment system for subject recruitment (Greiner, 2004), and at the time of the experiment approximately 1400 students were in the subject pool. A subset of students from the subject pool were invited to attend each specific session, and these students were informed that they would receive a 10 Yuan show-up payment and have the opportunity to earn more money during the experiment. Further, the invitation stated that the session would last no more than two hours.

Upon arrival at the laboratory, each subject was seated at a computer workstation such that no subject could observe another subject's screen. Subjects first read instructions detailing how to enter decisions and how earnings were determined.6 Then, 200 repetitions of the game were played. For the Pursue-Evade game, Column subjects were initially endowed with a balance of 260 tokens each, and Row subjects none. Each token was worth one-third of a Yuan. Each subject's total earnings consisted of the 10 Yuan show-up payment plus the monetary value of his token balance after the 200th repetition. While a mathematical possibility, no Column subjects in the Pursue-Evade game went bankrupt.

6 The instructions are available at http://www.excen.gsu.edu/swarthout/HMM/

The experiment was conducted with a Java software application created at the Georgia State University Experimental Economics Center (ExCEN) that allows humans to play normal form games against computerized algorithms. At the beginning of each repetition, each subject saw a graphical representation of the game on the screen. Each Column subject's game display was transformed so that he appeared to be a Row player. Thus, each subject selected an action by clicking on a row, and then confirmed his choice. After the repetition was complete, each subject saw the outcome highlighted on the game display, as well as a text message stating both players' actions and his own earnings for that repetition. Finally, each subject's current token balance and a history of past play were displayed at all times. The history consisted of an ordered list with each row displaying the repetition number, the actions selected by both players, and the subject's payoff from the specific repetition.

3.3 Data summary

We begin the summary of the experimental data by providing views of the joint distribution of the proportion of Left play for each subject-computer pair, while conditioning on whether the data are from the first 100 or last 100 repetitions. Figures 2 and 3 present these views for the Pursue-Evade and Gamble-Safe treatments, respectively. In each of these figures, the x-axis is the proportion of Left play for the Column player and the y-axis is the proportion of Left play for the Row player. Each arrow in the figures represents the play of a single human-computer pair, with the arrow tail representing the joint frequency of Left play in the first 100 repetitions and the arrow head representing the joint frequency of Left play in the final 100 repetitions. These arrows show the adjustments subjects make from the first half to the second half of play. We see that many arrows suggest substantial change in the human player frequency, but the changes do not trend in any one direction or uniformly towards the Nash equilibrium. Human play also displays greater dispersion and displacement from the Nash equilibrium than the computer opponents, suggesting nonconformity with the Nash equilibrium predictions.

Table 1 presents the means and standard deviations of subjects' frequencies of Left play by treatment and role. Recall that we have 2700 observations for each role in the Pursue-Evade treatment and 2800 observations for each role in the Gamble-Safe treatment. Although the Row player mean is close to the Nash equilibrium proportion in both game treatments, the Nash equilibrium proportion is rejected for all four cases at any reasonable level of significance. In each of the four cases, subjects' proportions of Left play display too much variance to have

[Figures 2 and 3: joint proportions of Left play in the first and last 100 repetitions. Panels: Human Row vs. NE Column; NE Row vs. Human Column. x-axis: Column player proportion Left (0.0 to 1.0); y-axis: Row player proportion Left (0.0 to 1.0).]

Table 1: Aggregate Summary Statistics

Statistic                            P-E Row   P-E Col   G-S Row   G-S Col
Average Left frequency               0.63      0.51      0.48      0.30
Standard deviation Left frequency    0.11      0.15      0.15      0.20
Nash equilibrium z-test statistic    −6.       −25.06    −3.18     −30.20

the Left action is played in the first and second 100 repetitions. A two-tailed binomial test of the Nash equilibrium at the 95 percent level of confidence gives us critical regions of less than 58 and more than 76. We reject the Nash proportion of Left play for 13 (12) of the Row subjects during the initial (final) 100 repetitions, and we reject the Nash proportion for 21 (20) of the Column subjects during the initial (final) 100 repetitions.

Next, we evaluate whether the subjects' sequences of action choices are serially independent via a nonparametric runs test. The z-test statistic is approximately distributed as a standard normal and is a function of the sequence length R, and the number of Left and Right sequences, $r_L$ and $r_R$, respectively. Its value is

$$z = \frac{r_L + r_R - \dfrac{2 r_L r_R}{R} - \dfrac{1}{2}}{\sqrt{\dfrac{2 r_L r_R \left(2 r_L r_R - R\right)}{R^2 (R - 1)}}}.$$

The null hypothesis of the test is that a subject's choices are independent realizations of a binomial random variable. We conduct a two-tailed test. Rejections from larger values of the test statistic indicate too many runs, and are symptomatic of negative serial correlation. Rejections from smaller values indicate too few runs, and are symptomatic of positive serial correlation. For the Row players, we reject serial independence for 10 subjects in the first half of the sample, and only 4 subjects in the second half. For the Column players, the number of rejections is 14 and 10 for the first and second half, respectively. There is a notable bias with respect to the Column players; 22 out of 24 of the rejections come from z scores that are too negative and indicate strong positive serial correlation. This is consistent with the results found by Rosenthal et al. (2003) in the original study of the Pursue-Evade game, but atypical for other studies which often find negative serial correlation.

Table 3 presents a similar data summary for the individual subjects of the Gamble-Safe treatment. In this case, the Nash equilibrium mixed strategy is equiprobable, and the critical regions of the two-sided binomial tests are 39 or less and 60 or more Left action choices. For the Row players, the Nash hypothesis is rejected for 16 subjects in the first 100 repetitions and 15 in the second 100 repetitions. Correspondingly for the Column players, the Nash hypothesis is rejected for 25 subjects in the first half of repetitions and 21 players in the

Table 2: Pursue-Evade individual subject summary data.

RowPlayer

Rounds1-100

Pair1234567101112131415161718192021222324252627

eiz

ColumnPlayer

Rounds1-100LeftCount59n50n51n22n,e56n53n41nn63n,e39n,e46nn61en45n50n50n70e41n45n50n70e585887n,e62e41n

RunsStat−4.44i6.23i1.01−2.45i−1.281.85−0.49−3.58i−1.−3.08i−2.16i−2.57i0.72−2.16i−3.76i−2.01i−2.61i−0.72−0.08−3.15i0.40−1.68−1.391.71−2.98i1.04−2.57i

Rounds101-200LeftCount45n53n71e13n,e44n40n76e52n76e29n,e31n,en48n37n,e41nn20n,e52n22n,e25n,e62e47n78n,e66e100n,e5927n,e

RunsStat−3.35i3.47i0.69−2.08i−0.261.47−2.35i0.22−2.35i−3.96i1.70−3.78i−1.39−1.−5.90i−1.55−3.16i−1.390.49−6.86i1.470.−1.86−1.—z1.58−1.90

RunsStat0.20−0.20−0.60−1.22−0.351.81−0.58−0.191.18−1.73−1.120.940.96−0.36−1.091.39−1.−2.51i1.00−0.09−0.45−0.332.09i−0.24−2.67i−0.82−2.55i

Rounds101-200LeftCount71e66e85e65n,e63e37n,e68e48n84n,e55n55n75e5973e66e88n,e73e40n65e78n,ee84n,e60e43n57n75e

LeftCount77n,e67e77n,e49n5939n,e62e47n51n65e48n63e60e78n,e68e70e46n51n80n,e51n68eee42n76e45n

RunsStat−2.40i1.32−0.12−1.601.170.931.040.44−4.82i−4.53i2.63i−2.29i−1.26−1.110.792.19i0.72−2.16i2.02i2.21i0.61−0.12−0.240.42−0.970.42−6.19i

Two-sided binomial test rejection of the NE proportion of 2/3 at the 5% level of significance.
Two-sided binomial test rejection of equiprobable proportion at the 5% level of significance.
Runs test rejection of serial independence at the 5% level of significance.
Missing values due to inapplicability of test on data with zero variation.

second half of repetitions. Also, we see that 9 Column player subjects almost exclusively play the pure minimax strategy (over 90 times) in the last 100 repetitions, while there is only one such Row player. Further, we find evidence of serial correlation in many individuals' choice sequences. For the Row players, we reject serial independence for 12 and 9 subjects in the first and last 100 repetitions, respectively. For the Column players, serial independence is rejected for 12 subjects in the first half of repetitions and 5 subjects in the second half of repetitions.

Table 3: Gamble-Safe individual subject summary data.

RowPlayer

Rounds1-100

Pair123456710111213141516171819202122232425262728

ColumnPlayer

Rounds1-100LeftCount24m19m36m3m26m70m15m63m33m36m62m24m415114m11m73m34m4m8m2m20m12m65m39m38m30m

RunsStat−2.07i−0.91−3.29i−1.0.66−4.80i1.39−1.86−2.78i−1.33−1.730.70−1.33−4.42i1.230.22−2.15i−0.872.09i−2.33i−3.30i0.240.000.42−2.10i−0.12−2.37i−2.i

Rounds101-200LeftCount14m12m440m31m599m65m19m29m5224m19m498m9m598m62m2m4m0m32m32m4031m446m

RunsStat−0.880.42−5.36i

—z2.41i−0.49−0.24−0.77−3.i−1.271.02−0.691.06−0.200.901.02−0.080.900.830.240.44—z−1.281.04−2.51i−1.60−1.49−3.03i

RunsStat3.261.21−1.08−0.060.97−2.57i−1.43−2.50i−6.41i−3.25i−1.244.27i−2.59i2.19i−0.470.131.81−6.50i−1.58−0.670.96−1.−0.32−0.701.010.30−2.62i0.21

Rounds101-200LeftCount39m5176m41m63m5626m77m76m5315m444441504m28m38m5937m20m594963m24m60m

LeftCount434360m4072m504160m16m39m72mm33m32m4135m31m5668m4620m69m4668m80m63m

RunsStat0.61−3.29i−0.63−1.050.17−1.810.13−3.77i−2.22i−0.123.42i3.48i−3.46i−0.81−1.0.961.17−4.75i−0.−2.10i−0.35−2.77i−2.21i0.05−0.742.42i−4.74i−0.13

Two-sided binomial test rejection of equiprobable proportion at the 5% level.
Runs test rejection of serial independence at the 5% level of significance.
Missing values due to inapplicability of test on data with zero variation.

4 Results of the HMM statistical analysis

In this section we present the estimated HMMs for the Pursue-Evade and Gamble-Safe treatments. First we report the means and variances of the posterior distributions of the transition probability matrices and the initial distributions over states. The estimates reflect adoption of both pure and mixed strategies and characterize the switching between latent strategies. We then use these estimates to generate a description of the dynamics of the latent mixed strategy evolution. Finally, we provide an assessment of the robustness of some of our assumed priors.

4.1 Pursue-Evade game

For the Pursue-Evade game, we restrict the latent state space S to contain four elements. We treat the corresponding vector of state dependent mixed strategies B as fixed and known, and the four elements are the pure Right strategy (PR), the focal equiprobable mixed strategy (EM), the Nash equilibrium strategy (NE) of two-thirds, and the pure Left strategy (PL). Specifically, we assume a point prior of B = (0, 0.5, 0.67, 1). Using this point prior we estimate the HMM using the MCMC method.

We run the Gibbs sampler for 10,000 iterations. Using the last 5000 iterations, we establish that the empirical density functions have converged by applying the Geweke test (Geweke, 1991). Then we use these last 5000 iterations to make statistical inferences. Table 4 presents the estimated means and standard deviations of the transition probabilities between states, the same for the initial probabilities over state posteriors, and the calculated limiting distributions of the Markov chains for both Row and Column.

Table 4: Estimated transition matrices, initial and limiting distributions of Pursue-Evade game

Row Player
                        PR_{t+1}         EM_{t+1}         NE_{t+1}         PL_{t+1}
PR_t                    0.75 (0.038)     0.145 (0.037)    0.051 (0.024)    0.0 (0.024)
EM_t                    0.025 (0.009)    0.95 (0.012)     0.013 (0.006)    0.013 (0.006)
NE_t                    0.005 (0.002)    0.007 (0.003)    0.939 (0.014)    0.05 (0.013)
PL_t                    0.022 (0.011)    0.023 (0.011)    0.218 (0.035)    0.737 (0.034)
Initial Distribution    0.082 (0.063)    0.614 (0.13)     0.193 (0.12)     0.111 (0.08)
Limiting Distribution   0.050            0.274            0.8              0.128

Column Player
                        PR_{t+1}         EM_{t+1}         NE_{t+1}         PL_{t+1}
PR_t                    0.752 (0.029)    0.204 (0.03)     0.025 (0.015)    0.018 (0.01)
EM_t                    0.095 (0.028)    0.879 (0.033)    0.019 (0.012)    0.007 (0.005)
NE_t                    0.012 (0.008)    0.021 (0.014)    0.96 (0.022)     0.007 (0.005)
PL_t                    0.039 (0.017)    0.034 (0.016)    0.031 (0.015)    0.6 (0.026)
Initial Distribution    0.043 (0.040)    0.161 (0.127)    0.735 (0.141)    0.061 (0.057)
Limiting Distribution   0.178            0.385            0.356            0.080

Note: standard deviations are in parentheses.

Our estimation of the initial distribution over states is presented in the fifth numeric row of Table 4. For both roles we find initial play has a high rate of mixed strategy play. Row players predominantly follow the EM (61%), while the Column players predominantly follow the NE (74%). Interestingly, this is quite different from the limiting distribution of the estimated transition matrices, which we can interpret as the long run steady state of the HMM. For the Row player, the mode of the limiting distribution is the NE (55%), while for the Column player both EM and NE are roughly equally likely, with probabilities of 39% and 36%, respectively. Clearly there is movement of strategy adoption over time.
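For readers who wish to see how a limiting distribution is obtained from a transition matrix, the following minimal Python sketch iterates the map from a distribution to its one-step-ahead distribution until it stops changing. The function name and the 2×2 example matrix are hypothetical and unrelated to the estimates in Table 4; the approach assumes the chain has a unique limiting distribution.

```python
def limiting_distribution(P, tol=1e-12, max_iter=100_000):
    """Approximate the limiting distribution of a Markov chain by repeated multiplication."""
    n = len(P)
    dist = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: new[k] = sum_j dist[j] * P[j][k].
        new = [sum(dist[j] * P[j][k] for j in range(n)) for k in range(n)]
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist

# Hypothetical two-state chain whose stationary distribution is (2/3, 1/3).
P_example = [
    [0.9, 0.1],
    [0.2, 0.8],
]
limit = limiting_distribution(P_example)
```

In this example the balance condition requires the flow from state 0 to state 1 (probability 0.1) to equal the flow from state 1 to state 0 (probability 0.2), so state 0 carries twice the long run weight of state 1.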

Some aspects of these dynamics can be seen by inspection of the estimated transition probabilities, given in the first four numeric rows of Table 4. Large values on the main diagonals and correspondingly small values on the off-diagonals indicate strong inertia in adopting new strategies. There are some interesting patterns when there is a transition between strategies. Consider the Row players first. When switching away from PR a player is almost three times as likely to switch to EM as to either of the other two strategies. Likewise, when switching away from EM a player is twice as likely to switch to PR as to either of the other strategies. There is a similar probabilistic cycle between NE and PL, with much larger switching probabilities between them. The dynamic effects of these cycling tendencies can be seen in Figure 4, which presents time series of the estimated proportion of subjects using each of the four strategies (see footnote 7). The PL and the NE series tend to mirror one another, as do the PR and EM series, albeit with more noise.

The results for the Column players in the right hand side of Figure 4 are quite different. The use of NE steadily declines while the adoption of EM rises in the first 50 repetitions. Furthermore, we see a slow emergence of PR over the course of the experiment. The probabilistic cycle between the EM and PR strategies is evident by their sharp mirroring pattern.

[Figure 4 panels: Row and Column. Series: NE, PL, PR, EM. Horizontal axis: Periods (0 to 200). Vertical axis: Probabilities (0.0 to 1.0).]

Figure 4: Strategy dynamics in Pursue-Evade game
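The series in Figure 4 are, per footnote 7, averages of state indicators over subjects and retained Gibbs draws. A minimal sketch of that averaging follows. The array layout (draws, subjects, periods), the function name `state_proportions`, and the randomly generated states are assumptions for illustration; only the 27 subjects and 200 periods are taken from the text.

```python
import numpy as np

def state_proportions(states, n_states):
    """Share of (draw, subject) pairs assigned to each state in each period.

    states: integer array of shape (L draws, N subjects, T periods).
    Returns an array of shape (n_states, T).
    """
    states = np.asarray(states)
    return np.stack([(states == j).mean(axis=(0, 1)) for j in range(n_states)])

rng = np.random.default_rng(1)
toy_states = rng.integers(0, 4, size=(50, 27, 200))   # toy stand-in for sampled states
props = state_proportions(toy_states, n_states=4)
```

Each column of `props` sums to one, so the four series can be read as a distribution over strategies in a given period.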

7 For strategy j the estimated proportion of subjects using that strategy in a given round t is

$$\hat{p}_{jt} = \frac{1}{27 \cdot 5000} \sum_{l=5001}^{10000} \sum_{i=1}^{27} I(s^{l}_{i,t} = j).$$

Next we assess the appropriateness of our degenerate prior on B by conducting the MCMC estimation using a uniform Beta prior, β(B_j; 1, 1), for each of the state dependent mixed strategies. We then sample from the posterior distributions to construct an empirical density function for each of the state dependent mixed strategies. In Figure 5 we present kernel smoothed plots of these approximations to posterior densities. Inspection reveals that for the Row player the posteriors are sharply peaked and closely centered on our assumed four strategies, except that the posterior corresponding to NE has a mode close to 3/4 instead of 2/3. For the Column player we see three out of four posteriors coincide with our assumed set. The one difference is PR, whose posterior has a mode of about 0.15.
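To make the role of the uniform Beta prior concrete, the sketch below shows the standard conjugate update for one state's probability of Left, conditional on a set of sampled states: with a Beta(1, 1) prior, the posterior is Beta(1 + number of Left plays in the state, 1 + number of Right plays in the state). This is a generic Gibbs step under stated assumptions, not the authors' sampler; the function name `sample_B`, the data layout, and the simulated actions are hypothetical.

```python
import numpy as np

def sample_B(actions, states, n_states, rng, a0=1.0, b0=1.0):
    """Draw each state's probability of Left from its Beta posterior.

    actions: 0/1 array (1 = Left); states: same-shape array of state labels.
    a0 = b0 = 1 corresponds to the uniform Beta prior.
    """
    actions = np.asarray(actions).ravel()
    states = np.asarray(states).ravel()
    draws = np.empty(n_states)
    for j in range(n_states):
        in_j = states == j
        left = actions[in_j].sum()          # Left plays while in state j
        right = in_j.sum() - left           # Right plays while in state j
        draws[j] = rng.beta(a0 + left, b0 + right)
    return draws

rng = np.random.default_rng(4)
B_true = np.array([0.0, 0.5, 0.67, 1.0])                # the point prior used in the text
toy_states = rng.integers(0, 4, size=(27, 200))         # hypothetical sampled states
toy_actions = (rng.random(toy_states.shape) < B_true[toy_states]).astype(int)
B_draw = sample_B(toy_actions, toy_states, n_states=4, rng=rng)
```

Repeating this draw across iterations, with the states resampled each time, is what would produce posterior densities of the kind summarized in Figure 5.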

[Figure 5 panels: Row and Column. Horizontal axis: Probability to Choose L (0.0 to 1.0). Vertical axis: Density.]

Figure 5: Posterior distribution of B in Pursue-Evade game

4.2 The Gamble-Safe game

We now turn our attention to the Gamble-Safe game. Here, we restrict the latent state space S to contain three elements. In our estimation we treat B as fixed and consisting of the elements PR (the minimax strategy), EM (which is also the NE strategy), and PL. We use the same parameters for the Gibbs sampler as we used in analyzing the PE data.

For both the Row and Column player data sets we ran the Gibbs sampler for 10,000 iterations, using the last 5000 iterations for inference after testing for convergence of the empirical densities with the Geweke test. The posterior means and standard deviations are reported in Table 5. Comparing the estimated initial distribution π to the limiting distribution suggests that an initial high probability of the mixed Nash strategy play reduces over time for both player roles. The change for the Column player is more dramatic as EM goes from 60% to 40%, and that reduction corresponds to a rise in the minimax strategy PR from 34% to 53%.

In contrast to the PE game, there is a segregation between mixed strategy and pure strategy followers. Evidence of this is found in the estimated Markov transition matrices, as we can see they almost fail to be irreducible (irreducibility roughly means that every state can be reached from every other state, even if it takes multiple transitions). The probability of continuing in the EM state is nearly one, indicating that once a subject follows the mixed strategy he is likely to do so for a large number of repetitions. Pure strategy adopters exhibit quite different patterns depending upon whether they are in the Row or Column role, in particular with respect to switching tendencies in the PL state. From the PL state, Row players transition to PR with 26% probability, while this transition probability is 79% for Column players.

Table 5: Estimated transition matrices, initial and limiting distributions of Gamble-Safe game

Row Player
                         PR_{t+1}         EM_{t+1}         PL_{t+1}
PR_t                     0.815 (0.027)    0.010 (0.020)    0.175 (0.016)
EM_t                     0.003 (0.011)    0.988 (0.021)    0.009 (0.011)
PL_t                     0.260 (0.031)    0.043 (0.031)    0.697 (0.037)
Initial distribution     0.169 (0.084)    0.779 (0.094)    0.052 (0.046)
Limiting distribution    0.203            0.660            0.137

Column Player
                         PR_{t+1}         EM_{t+1}         PL_{t+1}
PR_t                     0.1 (0.010)      0.006 (0.010)    0.103 (0.007)
EM_t                     0.007 (0.013)    0.985 (0.019)    0.008 (0.008)
PL_t                     0.791 (0.031)    0.042 (0.031)    0.167 (0.044)
Initial distribution     0.337 (0.098)    0.596 (0.103)    0.067 (0.052)
Limiting distribution    0.527            0.404            0.059

Note: standard deviations are in parentheses.
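The limiting distributions reported alongside the transition matrices can be checked numerically. The sketch below solves the stationarity condition πP = π, with the elements of π summing to one, for the Row player's posterior-mean transition matrix from Table 5. The function name `limiting_distribution` is an assumption, and applying the calculation to the matrix of posterior means (rather than to each Gibbs draw separately) is a simplification that may differ slightly from how the reported values were computed.

```python
import numpy as np

def limiting_distribution(P):
    """Stationary distribution of a row-stochastic matrix P.

    P[i, j] = Pr(state j at t+1 | state i at t). Solves pi @ P = pi, sum(pi) = 1.
    """
    P = np.asarray(P, dtype=float)
    P = P / P.sum(axis=1, keepdims=True)          # absorb rounding in reported means
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.concatenate([np.zeros(n), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Row player, Gamble-Safe game: posterior means from Table 5 (states PR, EM, PL).
P_row_gs = np.array([
    [0.815, 0.010, 0.175],
    [0.003, 0.988, 0.009],
    [0.260, 0.043, 0.697],
])
pi_row_gs = limiting_distribution(P_row_gs)
```

With these inputs the solution is approximately (0.203, 0.660, 0.137), in line with the limiting distribution reported for the Row player.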

Figure 6 presents the time series of the estimated proportion of subjects using each of the three latent strategies. Here we see the impact of the Markov transition probabilities that lead to inertia of the mixed strategy state and also the strong cycling tendencies of players between the Left and Right pure strategies. In the Row player figure, we see the EM strategy proportion has a smooth path that drops quickly from its initial level to its limiting value within the first 50 repetitions, after which it remains relatively constant. We also see the ragged mirroring pattern, indicating switching between the PR and PL strategies. We see similar features in the Column figure except that the EM shows a more gradual decline, and PR shows a corresponding gradual increase. This leads to the separation of the PR and PL strategies and allows us to see the clear short run switching between these strategies characterized by the jagged mirror relationship between their respective series.

[Figure 6 panels: Row and Column. Series: NE, PL, PR. Horizontal axis: Periods (0 to 200). Vertical axis: 0.0 to 1.0.]

Figure 6: Strategy dynamics in Gamble-Safe game

We test the robustness of our point prior B = [0, 0.5, 1] by estimating an HMM for which these state conditional strategies each have a uniform Beta prior. The kernel smoothed empirical density functions of the posteriors are presented in Figure 7 for both Row and Column players. For the Row player, the lower and middle posteriors are closer together than our assumed values. For the Column player, the posteriors of the lower two state dependent strategies are shifted to the right of our assumed ones. We conjecture these shifts could come from erroneously assumed homogeneity of the strictly mixed strategy used by subjects. An alternative would be to increase the number of elements in S or to model the individuals' strictly mixed strategies as coming from a hierarchical process.

[Figure 7 panels: Row and Column. Horizontal axis: Probability to Choose L (0.0 to 1.0). Vertical axis: Density.]

Figure 7: Posterior distribution of B in Gamble-Safe game

4.3 Forecasting realized actions

Until now our primary concern has been the estimation of when subjects adopt pure and mixed strategies, and our HMM's function has been to provide a statistical framework to test theories about latent strategy choice. Now we explore the potential of the HMM to predict actions taken, a valuable capability in widespread applications from strategic maneuvers in military engagements to knowing when a poker player is bluffing.

We first consider how well the estimated HMMs coincide with the observed proportions of Left play in our experimental data set. For this forecasting exercise on the experimental panel data set we calculate, for each game and role, the predicted proportion of Left play by the M subjects in period t,

$$\widehat{\mathrm{Left}}_t = \frac{1}{M \cdot L} \sum_{l=1}^{L} \sum_{d=1}^{M} \sum_{j=1}^{|S|} I(s^{l}_{d,t} = j)\, \beta_j .$$

Here L is the length of the sequence of the Gibbs sampler we use for statistical inference, and β_j is the jth element of B. For our data sets this sequence is iterations 5001 to 10000. Figure 8 presents plots of the time series of the predicted and actual proportions of Left play. In all four settings the predictions track the trends in the actual data. Admittedly this is an in-sample forecasting exercise, but the fit is nonetheless impressive, as minimizing forecast error is not the objective of our statistical inference exercise.
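The predicted proportion of Left play is an average, over retained draws and subjects, of the Left probability attached to each sampled state. A minimal sketch for the fixed-B (point prior) case follows; the array layout, the function name `predicted_left`, and the randomly generated states are assumptions for illustration.

```python
import numpy as np

def predicted_left(states, B):
    """Predicted proportion of Left play in each period.

    states: integer array of shape (L draws, M subjects, T periods).
    B: array with B[j] = Pr(Left | state j), held fixed across draws.
    """
    states = np.asarray(states)
    B = np.asarray(B, dtype=float)
    return B[states].mean(axis=(0, 1))     # look up each state's Left probability, then average

B_pe = np.array([0.0, 0.5, 0.67, 1.0])                # PR, EM, NE, PL in the Pursue-Evade game
rng = np.random.default_rng(3)
toy_states = rng.integers(0, 4, size=(50, 27, 200))   # hypothetical sampled states
left_hat = predicted_left(toy_states, B_pe)
```

Under a uniform Beta prior, B would vary by draw, and the lookup would need to use the draw-specific vector instead of a single fixed one.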

Out-of-sample forecasting is of more practical use and we can use the HMM for this purpose as well. We estimate, with 10,000 iterations of the Gibbs sampler, the HMM for both point and uniform Beta priors on B for the first 100 repetitions and use these estimates to make one-step-ahead forecasts of the last 100 repetitions. Let $\Psi = \{(P^{[j]}, \pi^{[j]}, B^{[j]}, s^{[j]}_{i,t})\}_{j=5001}^{10000}$ denote the realized draws of the Gibbs sampler for the last 5000 iterations of the MCMC algorithm that are used for statistical inference for the uniform Beta prior HMM. The predictive density of s_{i,t} is obtained by simulation from the joint posterior sample Ψ as follows:

$$\hat{s}^{[j]}_{i,t} \sim p(s_{i,t} \mid P^{[j]}, \tilde{s}_{i,t-1}), \quad j = 5001, \ldots, 10000. \tag{1}$$

We can use these sampled states for subject i to generate 5000 draws from the following marginal posterior:

$$\hat{y}^{[j]}_{i,t} \sim p(y_{i,t} \mid \hat{s}^{[j]}_{i,t}, B^{[j]}), \quad j = 5001, \ldots, 10000. \tag{2}$$

The average of the 5000 draws made according to Equation 2, denoted ŷ_{i,t}, is the prediction of y_{i,t}. Next we use y_{i,t} to generate the posterior density s̃_{i,t} by Bayes' Rule. This is substituted into Equation 1 to start the process of generating the prediction of y_{i,t+1}.

[Figure 8 panels: PE Human-Row, PE Human-Column, GS Human-Row, GS Human-Column. Series: Estimated Proportion, Empirical Proportion. Horizontal axis: Periods (0 to 200). Vertical axis: Proportion of Left Play (0.0 to 1.0).]

Figure 8: Actual and forecasted proportions of Left play

To assess the accuracy of our forecast of the holdout sample, we calculate and report the log-likelihood statistic

$$LL(y \mid \Psi) = \sum_{t=101}^{200} \sum_{i=1}^{M} \ln\left[ I(y_{i,t})\, p(\hat{y}_{i,t}) + (1 - I(y_{i,t}))\,(1 - p(\hat{y}_{i,t})) \right].$$

We also report the Akaike information criterion statistic, which is AIC(y, ŷ) = −2 · (LL − number of model parameters).
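The forecasting recursion and the two scores can be sketched for a single subject as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: it replaces simulation over 5000 posterior draws with the exact expectation for one fixed pair (P, B), it takes the indicator I(y) to equal one when the action is Left, and the parameter count passed to `aic` is purely illustrative. The matrix and starting belief are Row player values from Table 5, used only as toy inputs, and all function names are hypothetical.

```python
import numpy as np

def one_step_forecasts(y, P, B, start_belief):
    """One-step-ahead Pr(Left) for each period of a 0/1 action sequence y (1 = Left).

    P[i, j] = Pr(state j at t+1 | state i at t); B[j] = Pr(Left | state j).
    start_belief: distribution over states before the first forecast period.
    """
    P = np.asarray(P, dtype=float)
    B = np.asarray(B, dtype=float)
    belief = np.asarray(start_belief, dtype=float)
    forecasts = np.empty(len(y))
    for t, y_t in enumerate(y):
        prior = belief @ P                    # state prediction (Equation 1, in expectation)
        forecasts[t] = prior @ B              # action prediction (Equation 2, in expectation)
        like = B if y_t == 1 else 1.0 - B     # likelihood of the realized action by state
        post = like * prior
        belief = post / post.sum()            # Bayes' rule update after observing y_t
    return forecasts

def log_likelihood(y, p_left):
    """Sum of log predicted probabilities of the realized actions."""
    y = np.asarray(y)
    p_left = np.asarray(p_left, dtype=float)
    return float(np.sum(np.log(np.where(y == 1, p_left, 1.0 - p_left))))

def aic(ll, n_params):
    """AIC as defined in the text: -2 * (LL - number of model parameters)."""
    return -2.0 * (ll - n_params)

P = np.array([[0.815, 0.010, 0.175],
              [0.003, 0.988, 0.009],
              [0.260, 0.043, 0.697]])
B = np.array([0.0, 0.5, 1.0])                 # PR, EM, PL
belief0 = np.array([0.203, 0.660, 0.137])
rng = np.random.default_rng(2)
y_holdout = rng.integers(0, 2, size=100)      # hypothetical holdout actions for one subject
p_hat = one_step_forecasts(y_holdout, P, B, belief0)
ll = log_likelihood(y_holdout, p_hat)
score = aic(ll, n_params=8)                   # 8 is an illustrative count, not from the paper
```

Summing `log_likelihood` over subjects gives a statistic of the same form as LL above; a full replication would also average the forecasts over posterior draws rather than use a single parameter value.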

In order to evaluate the ability of alternative models to predict the future actions in games, we compare the performances of one-step-ahead forecasting of the HMMs of point and uniform Beta priors (UHMM) on B against the alternatives of the Nash equilibrium strategy and individual-specific mixed strategies (IM) which are estimated by each subject's proportion of Left play in the first 100 repetitions.

We summarize the out-of-sample forecasting performance for each of the four models in Table 6. First, for the Row players in both game treatments the two HMMs outperform the two other models both when we do and when we do not penalize for the number of parameters. For the holdout sample of the Column players, and not penalizing for the number of parameters, the IM model performs comparably to the two HMM models in the Pursue-Evade game, and the IM model performs comparably to the UHMM model (both of which outperform the HMM) in the Gamble-Safe game. However, when we penalize for increasing numbers of parameters we see the UHMM clearly outperforms the IM model. This suggests that our homogeneous dynamic model performs well on forecasting a population of game players, but also suggests that allowing for more individual heterogeneity could lead to even better out-of-sample forecasting performance.

Table 6: Out of sample forecasting performance

Row Player
Treatment    Statistic    NE       IM       HMM      UHMM
P-E Game     Loglik       −1748    −1736    −1723    −1710
             AIC          3497     3526     3475     3458
G-S Game     Loglik       −1941    −1916    −1887    −1865
             AIC          3884     3887     37       3751

Column Player
Treatment    Statistic    NE       IM       HMM      UHMM
P-E Game     Loglik       −2058    −1775    −1776    −1771
             AIC          4117     3605     3583     3580
G-S Game     Loglik       −1941    −1429    −1517    −1436
             AIC          3885     2915     3050     24

5 Discussion

We have introduced an HMM for the detection of pure and mixed strategy play in repeated games. We then applied this model to data from a new experiment in which human subjects repeatedly play against computer opponents that were programmed to play their part of the mixed strategy Nash equilibrium. We find that subjects do play both pure and mixed strategies, and switch between these over the course of play. Further, we find there is non-stationarity in the distribution of latent strategies over time. We observe a large movement from the initial distribution over strategies to that of the limiting distribution of the HMM. However, while the limiting distribution assigns probability to the subjects' NE strategy, the assigned probability is less than 1. Thus, for our data, we show that a mixed strategy Nash equilibrium is only partially self-enforcing. This is a new result in behavioral game theory, as previous studies have only considered the composite hypothesis that mixed strategy equilibria are both self-enforcing and also the limit point of the subjects' learning process.

Our primary interest has been modeling a population of players interacting in a game with known payoffs; however, there are several natural extensions to our approach. First, we could focus on the modeling and forecasting of a single subject from the population. To do this, we likely need to allow more individual heterogeneity in the HMM. A first step would be to allow each player to have a set of individual-specific strict mixed strategies to follow. This could be done by allowing individual state dependent mixed strategies B_is, or by modeling these B_is as coming from a hierarchical structure characterized by a small set of hyperparameters. A second extension is to model strategic situations in the field, in which the game payoffs are not known because of unobserved individual heterogeneity. For example, soccer players making penalty kicks may vary in their strength of kicking left or right, and similarly goalies also may have unobservable differences in defending kicks to the left and right. In such cases, the HMM can help identify such payoffs and also describe the players' learning process regarding these latent payoff types.

The HMM as presented in this paper is currently more of a statistical description than a behavioral model derived from optimizing behavior. To become such a behavioral model, the transition probabilities must become an endogenous function of a player's expected payoffs for the differing latent strategy choices. One possible approach is to allow a player to form beliefs about an opponent's action and then best respond. The issue here is that in an expected utility world a mixed strategy is never a strict best response. However, if one takes the approach that uncertainty about an opponent's action is ambiguous (i.e., a player doesn't have the ability to form a unique prior), then an ambiguity-averse player may strictly prefer a mixed strategy over pure strategies.

References

Aumann, Robert J. (1985), “On the non-transferable utility value: A comment on the Roth-Shafer examples.” Econometrica, 53, 667–678.

Bar-Eli, M., O. Azar, I. Ritov, Y. Keidar-Levin, and G. Schein (2007), “Action bias among elite soccer goalkeepers: The case of penalty kicks.” Journal of Economic Psychology, 28, 606–621.

Binmore, Ken, Joe Swierzbinski, and Chris Proulx (2001), “Does minimax work? An experimental study.” Economic Journal, 111, 445–464.

Bloomfield, Robert (1994), “Learning a mixed strategy equilibrium in the laboratory.” Journal of Economic Behavior & Organization, 25, 411–436.

Chiappori, Pierre, Steven Levitt, and T. Groseclose (2002), “Testing mixed-strategy equilibria when players are heterogeneous: The case of penalty kicks in soccer.” American Economic Review, 92, 1138–1151.

Geman, Stuart and Donald Geman (1987), “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images.” In Readings in computer vision: issues, problems, principles, and paradigms (Martin A. Fischler and Oscar Firschein, eds.), 564–584, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

Geweke, John (1991), “Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments.” Staff Report 148, Federal Reserve Bank of Minneapolis.

Greiner, Ben (2004), “An online recruitment system for economic experiments.” In Forschung und wissenschaftliches Rechnen (Kurt Kremer and Volker Macho, eds.), volume 63 of Ges. für Wiss. Datenverarbeitung, 79–93, GWDG Bericht.

Morgan, John and Martin Sefton (2002), “An experimental investigation of unprofitable games.” Games and Economic Behavior, 40, 123–146.

Nash, John (1951), “Non-Cooperative Games.” The Annals of Mathematics, 54, 286–295.

Noussair, Charles and Marc Willinger (2011), “Mixed strategies in an unprofitable game: an experiment.” Working papers, LAMETA, University of Montpellier.

Nyarko, Yaw and Andrew Schotter (2002), “An experimental study of belief learning using elicited beliefs.” Econometrica, 70, 971–1005.

Ochs, Jack (1995), “Games with unique, mixed strategy equilibria: An experimental study.” Games and Economic Behavior, 10, 202–217.

O’Neill, Barry (1987), “Nonmetric test of the minimax theory of two-person zero-sum games.” Proceedings of the National Academy of Sciences, U.S.A., 84, 2106–2109.

Palacios-Huerta, Ignacio (2003), “Professionals play minimax.” Review of Economic Studies, 70, 395–415.

Rabiner, Lawrence R. (1989), “A tutorial on hidden Markov models and selected applications in speech recognition.” Proceedings of the IEEE, 77, 257–286.

Rosenthal, Robert W., Jason Shachat, and Mark Walker (2003), “Hide and seek in Arizona.” International Journal of Game Theory, 32, 273–293.

Selten, Reinhard and Thorsten Chmura (2008), “Stationary concepts for experimental 2x2-games.” American Economic Review, 98, 938–966.

Shachat, Jason (2002), “Mixed strategy play and the minimax hypothesis.” Journal of Economic Theory, 104, 189–226.

Shachat, Jason and J. Todd Swarthout (2004), “Do we detect and exploit mixed strategy play by opponents?” Mathematical Methods of Operations Research, 59, 359–373.

Von Neumann, John (1928), “Zur Theorie der Gesellschaftsspiele.” Mathematische Annalen, 100, 295–320.

Von Neumann, John and Oskar Morgenstern (1944), Theory of Games and Economic Behavior. Princeton University Press.

Walker, Mark and John Wooders (2001), “Minimax play at Wimbledon.” American Economic Review, 91, 1521–1538.
