A hidden Markov model for the detection of pure and mixed strategy play in games∗

Jason Shachat†    J. Todd Swarthout‡    Lijia Wei§

July 7, 2012

Abstract

We propose a statistical model to assess whether individuals strategically use mixed strategies in repeated games. We formulate a hidden Markov model in which the latent state space contains both pure and mixed strategies, and allows switching between these states. We apply the model to data from an experiment in which human subjects repeatedly play a normal form game against a computer that always follows its part of the unique mixed strategy Nash equilibrium profile. Estimated results show significant mixed strategy play and non-stationary dynamics. We also explore the ability of the model to forecast action choice.

JEL classification: C92; C72; C10

Keywords: Mixed Strategy; Nash Equilibrium; Experiment; Hidden Markov Model

∗ This paper supersedes the previous working paper, “Man versus Nash: An experiment on the self-enforcing nature of mixed strategy equilibrium.”

† Wang Yanan Institute for Studies in Economics and MOE Key Laboratory in Econometrics, Xiamen University. jason.shachat@gmail.com

‡ Department of Economics and Experimental Economics Center, Georgia State University. swarthout@gsu.edu

§ Wang Yanan Institute for Studies in Economics and MOE Key Laboratory in Econometrics, Xiamen University. ljwie.wise@gmail.com

1 Introduction

Game theory and the Nash equilibrium solution concept are a key framework in the social sciences for modeling interactive behavior. The formulation of a normal form game consists of a set of players, a set of possible actions for each player, and a payoff function for each player that gives a real-valued payoff for any possible joint action profile, that is, a list of actions consisting of one for each player. A Nash equilibrium is a joint action profile such that each player’s assigned action results in at least as high a payoff to the player as any other possible action, assuming all other players choose their respective actions in the Nash equilibrium profile. If players are restricted to deterministically choose an action, then there are many games that don’t have a Nash equilibrium, such as the childhood game of Rock, Scissors, Paper. Confronted with this problem, Von Neumann (1928) generalized a player’s decision from choosing an action to choosing a probability distribution over his possible actions.¹ This choice of a probability distribution is called a “mixed” strategy, and a degenerate mixed strategy which chooses a particular action with probability one is called a “pure” strategy. The introduction of mixed strategies allows for existence of equilibrium across a broad class of games: from minimax solutions for zero-sum games (Von Neumann, 1928; Von Neumann and Morgenstern, 1944) to noncooperative equilibria for n-person games (Nash, 1951). While the role of mixed strategies in defining logically consistent solution concepts is not in doubt, the positive aspect of individuals actually playing mixed strategies is an open question of considerable interest.

Researchers’ efforts to answer this question have naturally focused on settings where the use of mixed strategies is most compelling: the repeated play of games which have a unique mixed strategy Nash equilibrium. The value of “being unpredictable” is readily seen in examples such as serves in tennis, “bluffing” in poker, and whether or not a tax authority audits a taxpayer. A common approach in this literature is to test whether the players’ action choices are consistent with the mixed strategy equilibrium. Some studies using controlled experiments with human subjects have the advantage of knowing the payoff functions, and test whether choice frequencies agree with the equilibrium strategies and whether players’ sequences of actions are serially independent (O’Neill, 1987; Binmore, Swierzbinski, and Proulx, 2001; Morgan and Sefton, 2002; Selten and Chmura, 2008). Other studies consider high-level sports competitions, such as soccer (Chiappori, Levitt, and Groseclose, 2002; Palacios-Huerta, 2003; Bareli, Azar, Ritov, Keidarlevin, and Schein, 2007) and tennis (Walker and Wooders, 2001), with the advantage of studying highly experienced players competing for high stakes and the disadvantage of unknown payoff functions.² These studies focus on testing the serial independence of action choice and the equilibrium implication of equal payoffs across action choices. Some of the most prominent and recurring results for both types of studies are that aggregated action frequencies across players agree with the equilibrium mixed strategies but individual action frequencies do not, and for many individuals action choices are serially correlated, violating the independence prediction.

To reconcile these issues of serial correlation and heterogeneity, several studies (Ochs, 1995; Bloomfield, 1994; Shachat, 2002; Noussair and Willinger, 2011) conduct laboratory experiments using the same type of games but directly elicit mixed strategies by obligating players to select a probability distribution over actions.³ Elicited strategies in these experiments exhibit various distinct patterns. Some subjects choose pure strategies almost exclusively, some choose strictly mixed strategies almost exclusively, and others use both types of strategies, usually in long sequences. Also, certain mixed strategies are often quite focal, such as choosing equal probability weight on a subset of actions rather than the Nash equilibrium proportions. Naive interpretation of these results suggests a clear distinction between play that is purposely unpredictable and play that is a pure best response to changing forecasts of an opponent’s action (Nyarko and Schotter, 2002). A more cautious interpretation is that subjects may eschew the randomizing device provided by the experimenter and instead internally randomize, or perhaps subjects choose strictly mixed strategies due to the experimenter effect of the novel elicitation method. Clearly a less invasive method to detect mixed strategy play would be valuable.

In this study we propose a hidden Markov model (HMM) to detect whether observed action choices are the result of pure or mixed strategy play in repeated two-person finite action games.⁴ There are three key ideas in our formulation: (1) we treat the strategy a player follows as a latent state and the action played as the observable output from the latent strategy; (2) the set of possible latent states is a discrete subset of all possible mixed strategies containing pure strategies, Nash equilibrium or minimax strategies, and focal mixed strategies; and (3) a player switches the latent strategy he follows according to a first order Markov process. We then demonstrate the ability of the model by applying it to a new experimental data set we collect. In our experiment, each human subject repeatedly plays a 2×2 game against a computer player that follows its mixed strategy equilibrium. Some subjects play a zero-sum game and others an unprofitable game.⁵ The estimated HMMs reveal several interesting results, including: (1) significant amounts of both pure and mixed strategy play; (2) the focal equiprobable mixed strategy is played more often than the Nash equilibrium strategy; (3) low transition probabilities between mixed and pure latent strategies; (4) dynamic adjustments in the types of strategies players follow over time; and (5) appreciable rates of both mixed and pure strategy play in the limiting distributions of the HMMs (interpreted as the long run equilibrium of play). We then extend the HMM from a statistical framework for evaluating hypotheses to one for forecasting action choice and assess its predictive accuracy.

¹ Along with generalizing the set of feasible actions to the set of mixed strategies, a player’s payoff function is extended by setting its value to the expected payoff given a profile of mixed strategies, commonly referred to as the expected utility property.

² The action sets are typically comprised of simple actions, e.g., {serve left, serve right} and {defend left, defend right}. The payoffs are assumed to be the probability of winning the task and these probabilities will differ based upon both the comparative skills between players and the relative strengths a player has for each action.

³ For example, Shachat (2002) adopts a game with four actions, each identified by a different color, for each player. Each player must fill a box with 100 cards in any combination of the four colored card types, and then one card is selected at random to determine the action played.

⁴ See Rabiner (1989) for a classic introduction to hidden Markov models.

2 A HMM of switching strategies

Consider an experiment in which we observe $M$ pairs of subjects, each playing $T$ periods of the same 2×2 normal form game. Often games like this are described by a two-by-two table, and for familiarity purposes we denote one subject’s player role as Row and the other as Column. We label each player role’s two possible actions Left (L) and Right (R), and express a subject’s mixed strategy as the probability of playing L. Of particular interest is when the game has a single Nash equilibrium and it is in strictly mixed strategies, although our framework is not restricted to study only such cases. Three factors confounding the analysis of data generated by this type of process are the latency of players’ mixed strategies, the heterogeneity of strategy adoption across subjects, and variation of adopted latent strategies over the course of repeated play. In this section, we present a model that accommodates and allows estimation of these confounds.

Consider the following HMM for a fixed player role. The state space $S$ is an $n$-element subset of subject $i$’s possible mixed strategies. Denote by $s_{i,t} \in S$ the strategy used by subject $i$ in period $t$; $S_i$ is the set of all possible $T$-element sequences of mixed strategies for $i$, with typical element $s_i$; and let $s$ be the collection of $s_i$ for all $M$ subjects in a given player role. Let $y_{i,t}$ denote subject $i$’s realized action in period $t$; $y_i$ is the corresponding $T$-element sequence of $i$’s observable actions, and $y$ is the collection of $y_i$ for all $M$ subjects. View $\{y, s\}$ as the output of the HMM.

The probability structure of the HMM has three elements. First, the $n$-element vector $B$, for which the element $B_j$ is the probability a subject chooses action Left, i.e. the mixed strategy, if he is in state $j$. We will provide two analyses which differ in how we specify $B$. In one approach we consider $B$ as known a priori, and $S$ and $B$ are redundant notation. Usually, in this approach, $B$ contains the two pure strategies, other strategies suggested by theory such as Nash equilibrium or minimax, and other focal strategies. In the second approach we treat the elements of $B$ as unknown parameters, the state dependent mixed strategies. The second element of the structure, $\pi$, is the initial multinomial probability distribution over $S$. The third element, $P$, is the $n \times n$ transition probability matrix. The element $P_{jk}$ is the probability a subject adopts strategy $k$ in period $t$ conditional upon having adopted strategy $j$ in period $t-1$.

⁵ An unprofitable game is one in which the minimax and Nash equilibrium solutions are distinct but yield the same expected payoff for each player.
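As a minimal sketch of this three-part structure, the following Python function simulates one subject's latent strategies and observed actions from given values of $B$, $\pi$, and $P$. The function name `simulate_hmm`, the four-state example, and the numerical values of `pi` and `P` are illustrative assumptions and not estimates from this paper.

```python
import random

def simulate_hmm(B, pi, P, T, rng):
    """Simulate T periods of latent states and Left/Right actions.

    B[j]    : probability of playing Left when in state j
    pi[j]   : probability that the initial state is j
    P[j][k] : probability of moving from state j to state k
    """
    n = len(B)
    states, actions = [], []
    state = rng.choices(range(n), weights=pi)[0]  # draw the initial state from pi
    for t in range(T):
        if t > 0:
            # first order Markov switching: the next state depends only on the current one
            state = rng.choices(range(n), weights=P[state])[0]
        states.append(state)
        actions.append("L" if rng.random() < B[state] else "R")
    return states, actions

# Hypothetical example: pure Right, equiprobable, two-thirds Left, pure Left.
B = [0.0, 0.5, 2 / 3, 1.0]
pi = [0.25, 0.25, 0.25, 0.25]
P = [[0.91 if j == k else 0.03 for k in range(4)] for j in range(4)]
states, actions = simulate_hmm(B, pi, P, T=200, rng=random.Random(0))
```

Only `actions` would be visible to the analyst; `states` plays the role of the latent sequence $s_i$.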

The likelihood function of $(B, \pi, P)$ is

\[ L(B,\pi,P \mid y,s) = \Pr(y,s \mid B,\pi,P). \]

Rewriting this likelihood in terms of the marginal distributions of $y$ and $s$ gives us

\[ L(B,\pi,P \mid y,s) = \Pr(y \mid s,B,\pi,P) \cdot \Pr(s \mid B,\pi,P). \]

Next, we assume that the marginal distribution of $y$ conditional on $s$ is independent of $\pi$ and $P$. In other words, once the state is realized then the probability of a Left action relies solely on the mixed strategy of the current state. Also, by the specification of the HMM, $s$ is independent of the state dependent mixed strategies $B$. This allows us to restate the previous likelihood function as

\[ L(B,\pi,P \mid y,s) = \Pr(y \mid s,B) \cdot \Pr(s \mid \pi,P). \]

Since the sequence of states for each subject is unobservable, we evaluate the likelihood by integrating over the set of all possible sequences

\[ L(B,\pi,P \mid y,s) = \prod_{i=1}^{M} \sum_{s_i \in S_i} \pi(s_{i,1})\, B_{s_{i,1}}^{I(y_{i,1})} \left(1 - B_{s_{i,1}}\right)^{1 - I(y_{i,1})} \prod_{t=2}^{T} P_{s_{i,t-1},s_{i,t}}\, B_{s_{i,t}}^{I(y_{i,t})} \left(1 - B_{s_{i,t}}\right)^{1 - I(y_{i,t})}, \]

where $I(y_{i,t})$ is an indicator function which equals one for the action Left and zero for the action Right. As $T$ grows, the number of calculations needed to evaluate this likelihood quickly becomes computationally impractical, because the inner sum runs over $n^T$ state sequences for each subject. We describe the Bayesian approach we take to estimate the HMM, although one could proceed down a frequentist path of maximizing the expected likelihood function using some variation of the EM (expectation-maximization) algorithm.
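To make the computational burden concrete, the sketch below evaluates a single subject's term of this likelihood in two ways: by direct summation over all $n^T$ state sequences, and by the standard forward recursion for HMMs, which returns the same value using on the order of $n^2 T$ operations. The forward recursion is a textbook alternative shown only for illustration; it is not the estimation approach taken in this paper, and the function names and parameter values are assumptions.

```python
from itertools import product

def emission(B, state, action):
    """Probability of the observed action given the latent state."""
    return B[state] if action == "L" else 1.0 - B[state]

def likelihood_brute_force(y, B, pi, P):
    """Sum the joint probability of (y, s) over every state sequence s."""
    n, total = len(B), 0.0
    for s in product(range(n), repeat=len(y)):
        prob = pi[s[0]] * emission(B, s[0], y[0])
        for t in range(1, len(y)):
            prob *= P[s[t - 1]][s[t]] * emission(B, s[t], y[t])
        total += prob
    return total

def likelihood_forward(y, B, pi, P):
    """Same quantity via the forward recursion (no scaling, so short T only)."""
    n = len(B)
    alpha = [pi[k] * emission(B, k, y[0]) for k in range(n)]
    for action in y[1:]:
        alpha = [
            sum(alpha[j] * P[j][k] for j in range(n)) * emission(B, k, action)
            for k in range(n)
        ]
    return sum(alpha)

# Hypothetical parameter values for a four-state model.
B = [0.0, 0.5, 2 / 3, 1.0]
pi = [0.25, 0.25, 0.25, 0.25]
P = [[0.91 if j == k else 0.03 for k in range(4)] for j in range(4)]
y = list("LLRLRR")

brute = likelihood_brute_force(y, B, pi, P)    # 4**6 = 4096 sequences
forward = likelihood_forward(y, B, pi, P)
```

With six periods the brute-force sum already visits 4096 sequences; at the 200 periods used in the experiment it would be infeasible, whereas the recursion remains cheap.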

In the Bayesian analysis, we first factor the joint posterior distribution of the unknown HMM parameters and unobserved states $s$ into the product of marginal conditional posterior distributions. Then we evaluate these marginal conditional posteriors through an iterative sampling procedure called the Markov Chain Monte Carlo (MCMC) method. MCMC is a simple but powerful procedure in which the empirical distributions of the sampled parameters converge to the true posterior distributions. After convergence, iterative sampling is continued to construct empirical density functions. These are used to make inferences regarding the parameters of the hidden Markov models.

Consider the posterior density function on the realized unobserved states and HMM parameters, $h(s,B,P,\pi \mid y)$. First, express this joint density as the product of the marginal density of HMM parameters conditional on the observed action choices and unobserved states with the marginal density of the states conditional upon action choices

\[ h(s,B,P,\pi \mid y) = h(B,P,\pi \mid s,y)\, h(s \mid y). \]

We have already assumed that the transition matrix $P$ and initial probabilities over states $\pi$ are independent of the action choices and state contingent mixed strategies $B$, which allows us to state

\[ h(s,B,P,\pi \mid y) = h(B \mid s,y)\, h(P,\pi \mid s,y)\, h(s \mid y). \]

This product of three conditional posteriors permits a simple Markov Chain procedure of sequentially sampling from these distributions. We start with some initial arbitrary values for the HMM parameters, $(B^{(l)}, P^{(l)}, \pi^{(l)})$ where $l = 0$. We create $s^{(0)}$ by simulation using $P^{(0)}$ and $\pi^{(0)}$ without conditioning on $y$. From these initial parameter values and the observed action sequences $y$, we use a Gibbs sampling algorithm to generate an initial sample of state sequences $s^{(1)}$. Then we make a random draw $P^{(1)}$ from the posterior distribution of $P$ conditional on $s^{(1)}$ and $y$, and proceed similarly to make a random draw of $\pi^{(1)}$. We complete the iteration by making a random draw $B^{(1)}$ from the posterior of $B$ conditional on $s^{(1)}$ and $y$. The key to the MCMC method is that as $l \to \infty$, the joint and marginal distributions of $B^{(l)}$, $P^{(l)}$, and $\pi^{(l)}$ converge weakly to the joint and marginal posterior distributions of these parameters (Geman and Geman, 1987). We now describe the details of each step in an iteration of the MCMC procedure.

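Step 1: Sampling the state sequences $s^{(l)}$

We begin by describing a Gibbs sampling technique for generating draws from the distribution of $s^{(l)}$ conditional upon $y$ and $(B^{(l-1)}, P^{(l-1)}, \pi^{(l-1)})$. The elements of $s_i$ can be drawn sequentially for each $t$ conditioning on the observed action choice $y_{i,t}$, the realized state in other periods, $\pi$, and $P$. Let $s_{i,\neq t}$ be the vector obtained by removing $s_{i,t}$ from the sequence $s_i$. Because the states are drawn sequentially, it consists of the states already redrawn in the current iteration, $s^{(l)}_{i,<t}$, and the states retained from the previous iteration, $s^{(l-1)}_{i,>t}$. Given $s_{i,\neq t}$, we express the conditional posterior distribution of $s_{i,t}$ as

\[ \Pr\left(s^{(l)}_{i,t} \mid y_{i,t}, B^{(l-1)}, P^{(l-1)}, s^{(l)}_{i,<t}, s^{(l-1)}_{i,>t}\right) \propto \Pr\left(y_{i,t} \mid s^{(l)}_{i,t}, B^{(l-1)}\right) \cdot \Pr\left(s^{(l)}_{i,t} \mid P^{(l-1)}, s^{(l)}_{i,<t}, s^{(l-1)}_{i,>t}\right) \]

with

\[ \Pr\left(s^{(l)}_{i,t} \mid P^{(l-1)}, s^{(l)}_{i,<t}, s^{(l-1)}_{i,>t}\right) = \Pr\left(s^{(l)}_{i,t} = k \mid P^{(l-1)}, s^{(l)}_{i,t-1}, s^{(l-1)}_{i,t+1}\right). \]

Consequently, the conditional posterior probability of $s_{i,t} = k$ for $t > 1$ is

\[ \Pr\left(s^{(l)}_{i,t} = k \mid \cdot\right) = \frac{\Pr\left(y_{i,t} \mid s^{(l)}_{i,t} = k, B^{(l-1)}_k\right) \cdot \Pr\left(s^{(l)}_{i,t} = k \mid P^{(l-1)}, s^{(l)}_{i,t-1}, s^{(l-1)}_{i,t+1}\right)}{\sum_{j=1}^{n} \Pr\left(y_{i,t} \mid s^{(l)}_{i,t} = j, B^{(l-1)}_j\right) \cdot \Pr\left(s^{(l)}_{i,t} = j \mid P^{(l-1)}, s^{(l)}_{i,t-1}, s^{(l-1)}_{i,t+1}\right)}, \]

and for $t = 1$

\[ \Pr\left(s^{(l)}_{i,1} = k \mid \cdot\right) = \frac{\Pr\left(y_{i,1} \mid s^{(l)}_{i,1} = k, B^{(l-1)}_k\right) \cdot \Pr\left(s^{(l)}_{i,1} = k \mid \pi^{(l-1)}, s^{(l-1)}_{i,2}\right)}{\sum_{j=1}^{n} \Pr\left(y_{i,1} \mid s^{(l)}_{i,1} = j, B^{(l-1)}_j\right) \cdot \Pr\left(s^{(l)}_{i,1} = j \mid \pi^{(l-1)}, s^{(l-1)}_{i,2}\right)}. \]

The state $s^{(l)}_{i,t}$ is determined by making a random draw from the uniform distribution on the unit interval, and comparing this draw to the calculated conditional probabilities of $s^{(l)}_{i,t}$.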
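The single-site update in Step 1 can be sketched as follows. This is a hedged illustration rather than the authors' code: the names `sample_state` and `gibbs_sweep` and all parameter values are assumptions. Each candidate state is weighted by the probability of the observed action, the transition into the state from the already-updated predecessor (or $\pi$ when $t = 1$), and the transition out of it to the not-yet-updated successor.

```python
import random

def emission(B, state, action):
    """Probability of the observed action given the latent state."""
    return B[state] if action == "L" else 1.0 - B[state]

def sample_state(t, s, y, B, pi, P, rng):
    """Draw s[t] from its conditional posterior given its neighbours and y[t]."""
    n, T = len(B), len(y)
    weights = []
    for k in range(n):
        w = emission(B, k, y[t])
        w *= pi[k] if t == 0 else P[s[t - 1]][k]   # entering state k
        if t < T - 1:
            w *= P[k][s[t + 1]]                    # leaving state k
        weights.append(w)
    # rng.choices normalises the weights and compares a uniform draw with the
    # cumulative probabilities; it raises ValueError if every weight is zero.
    return rng.choices(range(n), weights=weights)[0]

def gibbs_sweep(s, y, B, pi, P, rng):
    """Update the states in place, one period at a time, for one subject."""
    for t in range(len(y)):
        s[t] = sample_state(t, s, y, B, pi, P, rng)
    return s

# Hypothetical example with a four-state model and 50 observed actions.
rng = random.Random(1)
B = [0.0, 0.5, 2 / 3, 1.0]
pi = [0.25, 0.25, 0.25, 0.25]
P = [[0.91 if j == k else 0.03 for k in range(4)] for j in range(4)]
y = [rng.choice("LR") for _ in range(50)]
s = [1] * 50                      # arbitrary starting sequence
gibbs_sweep(s, y, B, pi, P, rng)
```

Because the pure strategies assign probability zero to one of the actions, a sweep never places a subject in the pure Right state in a period where Left was observed, or in the pure Left state where Right was observed.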

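Step 2: Sampling the transition matrix $P^{(l)}$ and $\pi^{(l)}$

The posterior distributions of $P_{jk}$ and $\pi$ depend only upon $s^{(l)}$ and the priors. We specify the prior of $\pi$ as a Dirichlet distribution $h(\pi; \alpha_1, \ldots, \alpha_n)$ where $\alpha_j = 1$ for $1 \le j \le n$. Similarly, we specify the prior of the $j$th row of $P$ as a Dirichlet distribution $h(P_{j1}, \ldots, P_{jn}; \eta_{j1}, \ldots, \eta_{jn})$. In an experiment, we record the data from the true start of the HMM process, so we assume that the joint posterior is simply the product of these two marginal posteriors. The respective posteriors of $\pi^{(l)}$ and $P^{(l)}$ are

\[ h(\pi \mid s) \propto \Pr(s \mid \pi)\, h(\pi; \alpha_1, \ldots, \alpha_n), \]

and

\[ h(P_{j1}, \ldots, P_{jn} \mid s) \propto \Pr(s \mid P_{j1}, \ldots, P_{jn})\, h(P_{j1}, \ldots, P_{jn}; \eta_{j1}, \ldots, \eta_{jn}). \]

If $\nu_{0j}$ is the number of incidences of $s_{i,1} = j$ in $s^{(l)}$, and $\nu_{jk}$ is the count of transitions from state $j$ to $k$ in $s^{(l)}$, then the conditional probabilities in the two posterior calculations are multinomial distributions

\[ h(\pi \mid s^{(l)}) \propto \pi_1^{\nu_{01}} \cdots \pi_{n-1}^{\nu_{0,n-1}} \cdot \left(1 - \sum_{k=1}^{n-1} \pi_k\right)^{\nu_{0n}} h(\pi; \alpha_1, \ldots, \alpha_n) \]

and

\[ h(P_{j1}, \ldots, P_{jn} \mid s^{(l)}) \propto P_{j1}^{\nu_{j1}} \cdots P_{j,n-1}^{\nu_{j,n-1}} \cdot \left(1 - \sum_{k=1}^{n-1} P_{jk}\right)^{\nu_{jn}} h(P_{j1}, \ldots, P_{jn}; \eta_{j1}, \ldots, \eta_{jn}). \]

Since the Dirichlet distribution is the conjugate prior for the multinomial distribution, these posterior distributions are also Dirichlet distributions for which each shape parameter is the sum of its prior value and the respective count

\[ h(\pi \mid s) = h(\pi; \alpha_1 + \nu_{01}, \ldots, \alpha_n + \nu_{0n}) \]

and

\[ h(P_{j1}, \ldots, P_{jn} \mid s) = h(P_{j1}, \ldots, P_{jn}; \eta_{j1} + \nu_{j1}, \ldots, \eta_{jn} + \nu_{jn}). \]

Hence, we select $\pi^{(l)}$ and $P^{(l)}$ by taking random draws from these distributions.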
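A minimal sketch of Step 2, assuming unit prior shape parameters for every row of $P$ (the text fixes $\alpha_j = 1$ for $\pi$; the value used for $\eta$ here is an assumption). A Dirichlet draw is obtained by normalising independent Gamma draws, which is a standard construction; the function names are illustrative.

```python
import random

def draw_dirichlet(shape, rng):
    """Draw from a Dirichlet distribution by normalising Gamma(shape_k, 1) draws."""
    g = [rng.gammavariate(a, 1.0) for a in shape]
    total = sum(g)
    return [x / total for x in g]

def count_transitions(sequences, n):
    """nu0[j]: sequences starting in j; nu[j][k]: transitions from j to k."""
    nu0 = [0] * n
    nu = [[0] * n for _ in range(n)]
    for s in sequences:
        nu0[s[0]] += 1
        for prev, cur in zip(s, s[1:]):
            nu[prev][cur] += 1
    return nu0, nu

def sample_pi_and_P(sequences, n, alpha, eta, rng):
    """Draw pi and each row of P from their Dirichlet posteriors."""
    nu0, nu = count_transitions(sequences, n)
    pi = draw_dirichlet([alpha[j] + nu0[j] for j in range(n)], rng)
    P = [draw_dirichlet([eta[j][k] + nu[j][k] for k in range(n)], rng)
         for j in range(n)]
    return pi, P

# Hypothetical two-state example with two subjects.
sequences = [[0, 0, 1, 1], [1, 1, 1, 0]]
nu0, nu = count_transitions(sequences, 2)
pi_draw, P_draw = sample_pi_and_P(
    sequences, 2, alpha=[1, 1], eta=[[1, 1], [1, 1]], rng=random.Random(2)
)
```

In the example, one subject starts in each state, so the posterior for $\pi$ is Dirichlet(2, 2), and the second row of $P$ has posterior Dirichlet(2, 4) because state 1 is followed by state 0 once and by state 1 three times.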

Step 3: Sampling the state dependent mixed strategies $B$

For our initial approach to modeling the state dependent mixed strategies, we assume $B$ corresponds to a known subset of $S$. In our Bayesian analysis this is equivalent to assuming a point prior on these strategies, and therefore there is no updating. So in our Gibbs sampling procedure we skip this step, and proceed to the next iteration of the Gibbs sampler. Of course this is a rather strong prior to assume, and we should evaluate whether it is appropriate. Accordingly, we conduct an auxiliary analysis in which we assume a uniform prior over the set of all mixed strategies.

In the auxiliary analysis we proceed as follows. The priors of state dependent mixed strategies $B_1, \ldots, B_n$ are assumed independent of each other and of the Markov process governing the states. Given these assumptions, we can think of each $B_j$ as a Bernoulli probability, and each Left (Right) action as a success (failure) when occurring in state $j$. The likelihood function is calculated as a binomial trial. Since it is the conjugate prior of the binomial, we assume the prior is a Beta distribution, denoted $\beta(B_j; \zeta_j, \gamma_j)$. We want a uniform prior as well, and that corresponds to setting the shape parameters $\zeta_j$ and $\gamma_j$ to one. The posterior distribution is simply

\[ h(B_j \mid y, s^{(l)}) = \beta(B_j; \zeta_j + \kappa_{L,j}, \gamma_j + \kappa_{R,j}), \]

where $\kappa_{L,j}$ and $\kappa_{R,j}$ are the number of times the actions Left and Right, respectively, are chosen when in state $j$ according to $s^{(l)}$. The state conditional mixed strategies $B^{(l)}_j$, $j = 1, \ldots, n$, are randomly drawn from these Beta posterior distributions, completing an iteration of the Gibbs sampler.
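The Beta update of the auxiliary analysis can be sketched in a few lines. The function name `sample_B` and the toy data are assumptions for illustration; the shape parameters default to one, matching the uniform prior described above.

```python
import random

def sample_B(sequences, actions, n, rng, zeta=1.0, gamma=1.0):
    """Draw each B_j from Beta(zeta + kappa_L[j], gamma + kappa_R[j])."""
    kappa_L = [0] * n   # Left choices observed while in state j
    kappa_R = [0] * n   # Right choices observed while in state j
    for s, y in zip(sequences, actions):
        for state, action in zip(s, y):
            if action == "L":
                kappa_L[state] += 1
            else:
                kappa_R[state] += 1
    B = [rng.betavariate(zeta + kappa_L[j], gamma + kappa_R[j]) for j in range(n)]
    return B, kappa_L, kappa_R

# Hypothetical two-state example: one subject, six periods.
sequences = [[0, 0, 0, 1, 1, 1]]
actions = [["L", "L", "R", "R", "R", "R"]]
B_draw, kappa_L, kappa_R = sample_B(sequences, actions, 2, random.Random(3))
```

Here state 0 is credited with two Left and one Right choice, giving a Beta(3, 2) posterior, while state 1 is credited with three Right choices, giving Beta(1, 4).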

The Gibbs sampler is run for a large number of iterations until the empirical distribution of all the parameters has converged (Geweke, 1991). Then the sampling procedure is allowed to continue to run for another number of iterations to build up an empirical distribution that corresponds to the posterior distribution of the HMM parameters. It is from this empirical distribution that we conduct statistical inferences.

3 The experiment

We apply our HMM framework to a new experimental data set that provides a likely setting for mixed strategies, and particularly Nash equilibrium strategies, to be adopted. Additionally, our procedures allow us to estimate for one player role without the need to also simultaneously model the opposing role, because each human subject repeatedly plays against a computer player that follows its mixed strategy equilibrium. Each subject is informed that his opponent is a computer but is given no information regarding the computer’s strategy. We adopt two different games in our experimental design, with each subject playing only one of the two games. One game is zero-sum and the other game is unprofitable.

3.1 The games

Our first game is a zero-sum asymmetric matching pennies game introduced by Rosenthal, Shachat, and Walker (2003). The normal form representation of this game is presented on the left side of Figure 1. The game is called Pursue-Evade because the Row player “captures” points from the Column player when the actions of the two players match, and the Column player “evades” a loss when the players’ actions differ. In the game each player can move either Left or Right, and the game has a unique Nash equilibrium in which each player chooses Left with probability two-thirds. In equilibrium, Row’s expected payoff is two-thirds, and correspondingly Column’s expected payoff is negative two-thirds.

Our second game is an unprofitable game introduced by Shachat and Swarthout (2004) called Gamble-Safe. Each player has a Gamble action (Left for each player) which yields a payoff of either two or zero, and a Safe action (Right for each player) which guarantees a payoff of one. The normal form representation of this game is presented on the right side of Figure 1. The Gamble-Safe game has a unique Nash equilibrium in which each player chooses the Left action with probability one-half, and each player earns an expected equilibrium payoff of one. Right is the minimax strategy for both players with a guaranteed payoff of one. Aumann (1985) argues that the Nash equilibrium prediction is not plausible in such an unprofitable game because its adoption assumes unnecessary risk to achieve the corresponding Nash equilibrium payoff. For example, imagine Row has Nash equilibrium beliefs and best responds by playing the Nash strategy. Row’s expected payoff is one. However, suppose Column instead adopts his minimax strategy Right. This reduces Row’s expected payoff to one-half. Row could avoid this risk by simply playing the minimax strategy. This aspect makes the Gamble-Safe game a more challenging test for the hypothesis of mixed strategy play than the zero-sum Pursue-Evade game.

        Pursue-Evade Game                       Gamble-Safe Game

                  Column Player                           Column Player
                  Left     Right                          Left     Right
Row      Left     1, -1    0, 0         Row      Left     2, 0     0, 1
Player   Right    0, 0     2, -2        Player   Right    1, 2     1, 1

Figure 1: The experimental games
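The equilibrium probabilities and expected payoffs quoted in this section follow from the usual indifference conditions for a 2×2 game and can be checked with exact arithmetic. The sketch below is illustrative; the function names are assumptions, and payoff matrices are indexed `[row action][column action]` with 0 for Left and 1 for Right.

```python
from fractions import Fraction

def mixed_equilibrium_2x2(row_payoff, col_payoff):
    """Return (p, q): Row's and Column's probability of Left in the interior
    mixed equilibrium, assuming one exists (nonzero denominators)."""
    R, C = row_payoff, col_payoff
    # Row mixes so that Column is indifferent between Left and Right.
    p = Fraction(C[1][1] - C[1][0], C[0][0] - C[1][0] - C[0][1] + C[1][1])
    # Column mixes so that Row is indifferent between Left and Right.
    q = Fraction(R[1][1] - R[0][1], R[0][0] - R[0][1] - R[1][0] + R[1][1])
    return p, q

def expected_payoff(M, p, q):
    """Expected payoff under matrix M when Row plays Left w.p. p, Column w.p. q."""
    return (p * q * M[0][0] + p * (1 - q) * M[0][1]
            + (1 - p) * q * M[1][0] + (1 - p) * (1 - q) * M[1][1])

# Payoffs as given in Figure 1.
pe_row, pe_col = [[1, 0], [0, 2]], [[-1, 0], [0, -2]]
gs_row, gs_col = [[2, 0], [1, 1]], [[0, 1], [2, 1]]

pe_p, pe_q = mixed_equilibrium_2x2(pe_row, pe_col)
gs_p, gs_q = mixed_equilibrium_2x2(gs_row, gs_col)

pe_row_value = expected_payoff(pe_row, pe_p, pe_q)
gs_row_value = expected_payoff(gs_row, gs_p, gs_q)
# Row plays the Nash strategy but Column plays minimax (always Right, q = 0).
gs_row_vs_minimax = expected_payoff(gs_row, gs_p, Fraction(0))
```

The last quantity is the one-half payoff behind Aumann's argument: Row's Nash mixture earns one only if Column also plays the Nash mixture.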

3.2 Subject recruitment and experiment protocol

We conducted six experiment sessions in the Finance and Economics Experimental Laboratory (FEEL) at Xiamen University during December 2011. A total of 110 undergraduate and masters students participated in the experiment, with each session containing between 12 and 22 subjects. Of these, 54 subjects were assigned to the Pursue-Evade game treatment, and 56 subjects were assigned to the Gamble-Safe game treatment. Subjects were evenly divided into Row and Column player roles within each treatment. FEEL uses the ORSEE online recruitment system for subject recruitment (Greiner, 2004), and at the time of the experiment approximately 1400 students were in the subject pool. A subset of students from the subject pool were invited to attend each specific session, and these students were informed that they would receive a 10 Yuan show-up payment and have the opportunity to earn more money during the experiment. Further, the invitation stated that the session would last no more than two hours.

Upon arrival at the laboratory, each subject was seated at a computer workstation such that no subject could observe another subject’s screen. Subjects first read instructions detailing how to enter decisions and how earnings were determined.⁶ Then, 200 repetitions of the game were played. For the Pursue-Evade game, Column subjects were initially endowed with a balance of 260 tokens each, and Row subjects none. Each token was worth one-third of a Yuan. Each subject’s total earnings consisted of the 10 Yuan show-up payment plus the monetary value of his token balance after the 200th repetition. While a mathematical possibility, no Column subjects in the Pursue-Evade game went bankrupt.

⁶ The instructions are available at http://www.excen.gsu.edu/swarthout/HMM/

The experiment was conducted with a Java software application created at the Georgia State University Experimental Economics Center (ExCEN) that allows humans to play normal form games against computerized algorithms. At the beginning of each repetition, each subject saw a graphical representation of the game on the screen. Each Column subject’s game display was transformed so that he appeared to be a Row player. Thus, each subject selected an action by clicking on a row, and then confirmed his choice. After the repetition was complete, each subject saw the outcome highlighted on the game display, as well as a text message stating both players’ actions and his own earnings for that repetition. Finally, each subject’s current token balance and a history of past play were displayed at all times. The history consisted of an ordered list with each row displaying the repetition number, the actions selected by both players, and the subject’s payoff from the specific repetition.

3.3 Data summary

We begin the summary of the experimental data by providing views of the joint distribution of the proportion of Left play for each subject-computer pair, while conditioning on whether the data are from the first 100 or last 100 repetitions. Figures 2 and 3 present these views for the Pursue-Evade and Gamble-Safe treatments, respectively. In each of these figures, the x-axis is the proportion of Left play for the Column player and the y-axis is the proportion of Left play for the Row player. Each arrow in the figures represents the play of a single human-computer pair, with the arrow tail representing the joint frequency of Left play in the first 100 repetitions and the arrow head representing the joint frequency of Left play in the final 100 repetitions. These arrows show the adjustments subjects make from the first half to the second half of play. We see that many arrows suggest substantial change in the human player frequency, but the changes do not trend in any one direction or uniformly towards the Nash equilibrium. Human play also displays greater dispersion and displacement from the Nash equilibrium than the computer opponents, suggesting nonconformity with the Nash equilibrium predictions.

Table 1 presents the means and standard deviations of subjects’ frequencies of Left play by treatment and role. Recall that we have 2700 observations for each role in the Pursue-Evade treatment and 2800 observations for each role in the Gamble-Safe treatment. Although the Row player mean is close to the Nash equilibrium proportion in both game treatments, the Nash equilibrium proportion is rejected for all four cases at any reasonable level of significance. In each of the four cases, subjects’ proportions of Left play display too much variance to have

[Figures 2 and 3: joint frequencies of Left play in the first and last 100 repetitions for each subject-computer pair, in the Pursue-Evade and Gamble-Safe treatments respectively. Panels: Human Row vs. NE Column; NE Row vs. Human Column. Axes run from 0.0 to 1.0; surviving axis labels are Computer Column Proportion Left, Human Row Proportion Left, and Computer Row Proportion Left.]

Table 1: Aggregate Summary Statistics

Statistic                             P-E Row    P-E Col    G-S Row    G-S Col
Average Left frequency                0.63       0.51       0.48       0.30
Standard deviation Left frequency     0.11       0.15       0.15       0.20
Nash equilibrium z-test statistic     −6.        −25.06     −3.18      −30.20

the Left action is played in the first and second 100 repetitions. A two-tailed binomial test of the Nash equilibrium at the 95 percent level of confidence gives us critical regions of less than 58 and more than 76. We reject the Nash proportion of Left play for 13 (12) of the Row subjects during the initial (final) 100 repetitions, and we reject the Nash proportion for 21 (20) of the Column subjects during the initial (final) 100 repetitions.

Next, we evaluate whether the subjects’ sequences of action choices are serially independent via a nonparametric runs test. The z-test statistic has a distribution approximate to the standard normal and is a function of the sequence length $R$, and the number of Left and Right sequences, $r_L$ and $r_R$, respectively. Its value is

\[ z = \frac{r_L + r_R - \dfrac{2 r_L r_R}{R} - \dfrac{1}{2}}{\sqrt{\dfrac{2 r_L r_R \left(2 r_L r_R - R\right)}{R^2 (R - 1)}}}. \]

The null hypothesis of the test is that a subject’s choices are independent realizations of a binomial random variable. We conduct a two-tailed test. Rejections from larger values of the test statistic indicate too many runs, and are symptomatic of negative serial correlation. Rejections from smaller values indicate too few runs, and are symptomatic of positive serial correlation. For the Row players, we reject serial independence for 10 subjects in the first half of the sample, and only 4 subjects in the second half. For the Column players, the number of rejections is 14 and 10 for the first and second half, respectively. There is a notable bias with respect to the Column players; 22 out of 24 of the rejections come from z scores that are too negative and indicate strong positive serial correlation. This is consistent with the results found by Rosenthal et al. (2003) in the original study of the Pursue-Evade game, but atypical for other studies which often find negative serial correlation.
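For readers who want to experiment with serial independence tests, the sketch below implements the textbook Wald-Wolfowitz runs test under the normal approximation. It is an assumption-laden illustration, not necessarily the exact statistic used in this paper: it uses the counts of Left and Right actions together with the total number of runs, and it omits any continuity correction. The function name `runs_test_z` is hypothetical.

```python
import math

def runs_test_z(sequence):
    """Wald-Wolfowitz runs test z statistic for a sequence of 'L'/'R' actions.

    Returns None when the test is inapplicable, e.g. when the sequence has
    no variation.
    """
    n = len(sequence)
    n_left = sequence.count("L")
    n_right = n - n_left
    if n_left == 0 or n_right == 0:
        return None
    # A new run starts at every change of action.
    runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    mean = 2 * n_left * n_right / n + 1
    var = 2 * n_left * n_right * (2 * n_left * n_right - n) / (n * n * (n - 1))
    if var <= 0:
        return None
    return (runs - mean) / math.sqrt(var)

z_alternating = runs_test_z("LR" * 5)           # too many runs: positive z
z_clustered = runs_test_z("LLLLLRRRRR")         # too few runs: negative z
z_constant = runs_test_z("L" * 10)              # zero variation: not applicable
```

A strictly alternating sequence yields a large positive statistic (negative serial correlation), while a clustered sequence yields a large negative one (positive serial correlation), matching the interpretation given in the text.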

Table 2: Pursue-Evade individual subject summary data (pairs 1 to 27).

Row Player
Rounds 1-100, Left Count: 77n,e 67e 77n,e 49n 59 39n,e 62e 47n 51n 65e 48n 63e 60e 78n,e 68e 70e 46n 51n 80n,e 51n 68e e e 42n 76e 45n
Rounds 1-100, Runs Stat: −2.40i 1.32 −0.12 −1.60 1.17 0.93 1.04 0.44 −4.82i −4.53i 2.63i −2.29i −1.26 −1.11 0.79 2.19i 0.72 −2.16i 2.02i 2.21i 0.61 −0.12 −0.24 0.42 −0.97 0.42 −6.19i
Rounds 101-200, Left Count: 71e 66e 85e 65n,e 63e 37n,e 68e 48n 84n,e 55n 55n 75e 59 73e 66e 88n,e 73e 40n 65e 78n,e e 84n,e 60e 43n 57n 75e
Rounds 101-200, Runs Stat: 0.20 −0.20 −0.60 −1.22 −0.35 1.81 −0.58 −0.19 1.18 −1.73 −1.12 0.94 0.96 −0.36 −1.09 1.39 −1. −2.51i 1.00 −0.09 −0.45 −0.33 2.09i −0.24 −2.67i −0.82 −2.55i

Column Player
Rounds 1-100, Left Count: 59n 50n 51n 22n,e 56n 53n 41n n 63n,e 39n,e 46n n 61e n 45n 50n 50n 70e 41n 45n 50n 70e 58 58 87n,e 62e 41n
Rounds 1-100, Runs Stat: −4.44i 6.23i 1.01 −2.45i −1.28 1.85 −0.49 −3.58i −1. −3.08i −2.16i −2.57i 0.72 −2.16i −3.76i −2.01i −2.61i −0.72 −0.08 −3.15i 0.40 −1.68 −1.39 1.71 −2.98i 1.04 −2.57i
Rounds 101-200, Left Count: 45n 53n 71e 13n,e 44n 40n 76e 52n 76e 29n,e 31n,e n 48n 37n,e 41n n 20n,e 52n 22n,e 25n,e 62e 47n 78n,e 66e 100n,e 59 27n,e
Rounds 101-200, Runs Stat: −3.35i 3.47i 0.69 −2.08i −0.26 1.47 −2.35i 0.22 −2.35i −3.96i 1.70 −3.78i −1.39 −1. −5.90i −1.55 −3.16i −1.39 0.49 −6.86i 1.47 0. −1.86 −1. –z 1.58 −1.90

n: Two-sided binomial test rejection of the NE proportion of 2/3 at the 5% level of significance.
e: Two-sided binomial test rejection of equiprobable proportion at the 5% level of significance.
i: Runs test rejection of serial independence at the 5% level of significance.
z: Missing values due to inapplicability of test on data with zero variation.

Table 3 presents a similar data summary for the individual subjects of the Gamble-Safe treatment. In this case, the Nash equilibrium mixed strategy is equiprobable, and the critical regions of the two-sided binomial tests are 39 or less and 60 or more Left action choices. For the Row players, the Nash hypothesis is rejected for 16 subjects in the first 100 repetitions and 15 in the second 100 repetitions. Correspondingly for the Column players, the Nash hypothesis is rejected for 25 subjects in the first half of repetitions and 21 players in the second half of repetitions. Also, we see that 9 Column player subjects almost exclusively play the pure minimax strategy (over 90 times) in the last 100 repetitions, while there is only one such Row player. Further, we find evidence of serial correlation in many individuals’ choice sequences. For the Row players, we reject serial independence for 12 and 9 subjects in the first and last 100 repetitions, respectively. For the Column players, serial independence is rejected for 12 subjects in the first half of repetitions and 5 subjects in the second half of repetitions.

Table 3: Gamble-Safe individual subject summary data (pairs 1 to 28).

Row Player
Rounds 1-100, Left Count: 43 43 60m 40 72m 50 41 60m 16m 39m 72m m 33m 32m 41 35m 31m 56 68m 46 20m 69m 46 68m 80m 63m
Rounds 1-100, Runs Stat: 0.61 −3.29i −0.63 −1.05 0.17 −1.81 0.13 −3.77i −2.22i −0.12 3.42i 3.48i −3.46i −0.81 −1. 0.96 1.17 −4.75i −0. −2.10i −0.35 −2.77i −2.21i 0.05 −0.74 2.42i −4.74i −0.13
Rounds 101-200, Left Count: 39m 51 76m 41m 63m 56 26m 77m 76m 53 15m 44 44 41 50 4m 28m 38m 59 37m 20m 59 49 63m 24m 60m
Rounds 101-200, Runs Stat: 3.26 1.21 −1.08 −0.06 0.97 −2.57i −1.43 −2.50i −6.41i −3.25i −1.24 4.27i −2.59i 2.19i −0.47 0.13 1.81 −6.50i −1.58 −0.67 0.96 −1. −0.32 −0.70 1.01 0.30 −2.62i 0.21

Column Player
Rounds 1-100, Left Count: 24m 19m 36m 3m 26m 70m 15m 63m 33m 36m 62m 24m 41 51 14m 11m 73m 34m 4m 8m 2m 20m 12m 65m 39m 38m 30m
Rounds 1-100, Runs Stat: −2.07i −0.91 −3.29i −1. 0.66 −4.80i 1.39 −1.86 −2.78i −1.33 −1.73 0.70 −1.33 −4.42i 1.23 0.22 −2.15i −0.87 2.09i −2.33i −3.30i 0.24 0.00 0.42 −2.10i −0.12 −2.37i −2.i
Rounds 101-200, Left Count: 14m 12m 44 0m 31m 59 9m 65m 19m 29m 52 24m 19m 49 8m 9m 59 8m 62m 2m 4m 0m 32m 32m 40 31m 44 6m
Rounds 101-200, Runs Stat: −0.88 0.42 −5.36i –z 2.41i −0.49 −0.24 −0.77 −3.i −1.27 1.02 −0.69 1.06 −0.20 0.90 1.02 −0.08 0.90 0.83 0.24 0.44 –z −1.28 1.04 −2.51i −1.60 −1.49 −3.03i

m: Two-sided binomial test rejection of equiprobable proportion at the 5% level.
i: Runs test rejection of serial independence at the 5% level of significance.
z: Missing values due to inapplicability of test on data with zero variation.

4 Results of the HMM statistical analysis

In this section we present the estimated HMMs for the Pursue-Evade and Gamble-Safe treatments. First we report the means and variances of the posterior distributions of the transition probability matrices and the initial distributions over states. The estimates reflect adoption of both pure and mixed strategies and characterize the switching between latent strategies. We then use these estimates to generate a description of the dynamics of the latent mixed strategy evolution. Finally, we provide an assessment of the robustness of some of our assumed priors.

4.1 Pursue-Evade game

For the Pursue-Evade game, we restrict the latent state space $S$ to contain four elements. We treat the corresponding vector of state dependent mixed strategies $B$ as fixed and known, and the four elements are the pure Right strategy (PR), the focal equiprobable mixed strategy (EM), the Nash equilibrium strategy (NE) of two-thirds, and the pure Left strategy (PL). Specifically, we assume a point prior of $B = (0, 0.5, 0.67, 1)$. Using this point prior we estimate the HMM using the MCMC method.

We run the Gibbs sampler for 10,000 iterations. Using the last 5000 iterations, we establish that the empirical density functions have converged by applying the Geweke test (Geweke, 1991). Then we use these last 5000 iterations to make statistical inferences. Table 4 presents the estimated means and standard deviations of the transition probabilities between states, the same for the initial probabilities over state posteriors, and the calculated limiting distributions of the Markov chains for both Row and Column.

Table 4: Estimated transition matrices, initial and limiting distributions of Pursue-Evade game

Row Player
                        PR_{t+1}         EM_{t+1}         NE_{t+1}         PL_{t+1}
PR_t                    0.75 (0.038)     0.145 (0.037)    0.051 (0.024)    0.054 (0.024)
EM_t                    0.025 (0.009)    0.95 (0.012)     0.013 (0.006)    0.013 (0.006)
NE_t                    0.005 (0.002)    0.007 (0.003)    0.939 (0.014)    0.05 (0.013)
PL_t                    0.022 (0.011)    0.023 (0.011)    0.218 (0.035)    0.737 (0.034)
π                       0.082 (0.063)    0.614 (0.13)     0.193 (0.12)     0.111 (0.08)
Limiting Distribution   0.050            0.274            0.548            0.128

Column Player
                        PR_{t+1}         EM_{t+1}         NE_{t+1}         PL_{t+1}
PR_t                    0.752 (0.029)    0.204 (0.03)     0.025 (0.015)    0.018 (0.01)
EM_t                    0.095 (0.028)    0.879 (0.033)    0.019 (0.012)    0.007 (0.005)
NE_t                    0.012 (0.008)    0.021 (0.014)    0.96 (0.022)     0.007 (0.005)
PL_t                    0.039 (0.017)    0.034 (0.016)    0.031 (0.015)    0.896 (0.026)
π                       0.043 (0.040)    0.161 (0.127)    0.735 (0.141)    0.061 (0.057)
Limiting Distribution   0.178            0.385            0.356            0.080

Note: standard deviations are in parentheses.

Our estimation of the initial distribution over states is presented in the fifth numeric row of Table 4. For both roles we find initial play has a high rate of mixed strategy play. Row players predominantly follow the EM (61%), while the Column players predominantly follow the NE (74%). Interestingly, this is quite different from the limiting distribution of the estimated transition matrices, which we can interpret as the long run steady state of the HMM. For the Row player, the mode of the limiting distribution is the NE (55%), while for the Column player both EM and NE are roughly equally likely, with probabilities of 39% and 36%, respectively. Clearly there is movement of strategy adoption over time.
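The limiting distribution of a transition matrix can be approximated by repeatedly applying the matrix to any starting distribution. The sketch below does this for the Row player's posterior mean transition probabilities. Two assumptions are made here: the PR-to-PL entry is taken as 0.054 so that the first row sums to one, and each row is rescaled to sum exactly to one to absorb rounding. The function name is hypothetical, and the paper may compute the limiting distribution differently (for example, within each MCMC draw).

```python
def limiting_distribution(P, iterations=5000):
    """Approximate the stationary distribution of P by power iteration."""
    n = len(P)
    rows = [[x / sum(row) for x in row] for row in P]   # rescale rows to sum to one
    dist = [1.0 / n] * n                                # arbitrary starting point
    for _ in range(iterations):
        dist = [sum(dist[j] * rows[j][k] for j in range(n)) for k in range(n)]
    return dist

# Row player posterior means, states ordered PR, EM, NE, PL.
P_row = [
    [0.750, 0.145, 0.051, 0.054],   # 0.054 assumed so that the row sums to one
    [0.025, 0.950, 0.013, 0.013],
    [0.005, 0.007, 0.939, 0.050],
    [0.022, 0.023, 0.218, 0.737],
]
limit_row = limiting_distribution(P_row)
```

The result puts a little over half of the long run mass on NE and roughly a quarter on EM, in line with the limiting distribution reported for the Row player.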

Some aspects of these dynamics can be seen by inspection of the estimated transition probabilities, given in the first four numeric rows of Table 4. Large values on the main diagonals and correspondingly small values on the off-diagonals indicate strong inertia in adopting new strategies. There are some interesting patterns when there is a transition between strategies. Consider the Row players first. When switching away from PR, a player is almost three times as likely to switch to EM as to either of the other two strategies. Likewise, when switching away from EM, a player is twice as likely to switch to PR as to either of the other strategies. There is a similar probabilistic cycle between NE and PL, with much larger switching probabilities between them. The dynamic effects of these cycling tendencies can be seen in Figure 4, which presents time series of the estimated proportion of subjects using each of the four strategies (see footnote 7). The PL and the NE series tend to mirror one another, as do the PR and EM series, albeit with more noise.

The results for the Column players in the right hand side of Figure 4 are quite different. The use of NE steadily declines while the adoption of EM rises in the first 50 repetitions. Furthermore, we see a slow emergence of PR over the course of the experiment. The probabilistic cycle between the EM and PR strategies is evident in their sharp mirroring pattern.

[Figure: two panels, Row and Column, plotting the estimated Probabilities (0.0 to 1.0) of the NE, PL, PR and EM strategies against Periods (0 to 200).]

Figure 4: Strategy dynamics in Pursue-Evade game

Footnote 7: For strategy j, the estimated proportion of subjects using that strategy in a given round t is $\frac{1}{27 \cdot 5000} \sum_{l=5001}^{10000} \sum_{i=1}^{27} I_{\{s^{l}_{i,t} = j\}}$.

Next we assess the appropriateness of our degenerate prior on B by conducting the MCMC estimation using a uniform Beta prior, β(B_j; 1, 1), for each of the state-dependent mixed strategies. We then sample from the posterior distributions to construct an empirical density function for each of the state-dependent mixed strategies. In Figure 5 we present kernel smoothed plots of these approximations to the posterior densities. Inspection reveals that for the Row player the posteriors are sharply peaked and closely centered on our assumed four strategies, except for the NE, whose posterior has a mode close to 3/4 instead of 2/3. For the Column player, three out of four posteriors coincide with our assumed set. The one difference is the PR, whose posterior has a mode of about 0.15.

[Figure: two panels, Row and Column, plotting posterior Density against Probability to Choose L (0.0 to 1.0).]

Figure 5: Posterior distribution of B in Pursue-Evade game
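To make the role of the uniform Beta prior concrete, the sketch below shows the standard Beta-Bernoulli update for one state-dependent mixed strategy. It rests on an assumption not spelled out in this section: that, conditional on the actions currently assigned to a latent state, the Left probability of that state has a Beta posterior obtained by adding the Left and Right counts to the prior parameters. The function name and the action data are hypothetical.

```python
import random


def beta_posterior_params(actions, a_prior=1.0, b_prior=1.0):
    """Beta posterior for P(Left) given binary actions (1 = Left, 0 = Right)."""
    n_left = sum(actions)
    n_right = len(actions) - n_left
    return a_prior + n_left, b_prior + n_right


# Hypothetical actions currently assigned to one latent state.
actions_in_state = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
a_post, b_post = beta_posterior_params(actions_in_state)
posterior_mean = a_post / (a_post + b_post)

# One Gibbs-style draw of the state's Left probability from its posterior.
rng = random.Random(0)
draw = rng.betavariate(a_post, b_post)
```

With seven Left and three Right actions, the uniform prior yields a Beta(8, 4) posterior. Repeating such draws across sampler iterations is what would produce empirical densities like those summarized in Figure 5.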

4.2 The Gamble-Safe game

We now turn our attention to the Gamble-Safe game. Here, we restrict the latent state space S to contain three elements. In our estimation we treat B as fixed and consisting of the elements PR (the minimax strategy), EM (which is also the NE strategy), and PL. We use the same parameters for the Gibbs sampler as we used in analyzing the PE data.

For both the Row and Column player data sets we ran the Gibbs sampler for 10,000 iterations, using the last 5000 iterations for inference after testing for convergence of the empirical densities with the Geweke test. The posterior means and standard deviations are reported in Table 5. Comparing the estimated initial distribution π to the limiting distribution suggests that an initially high probability of mixed Nash strategy play falls over time for both player roles. The change for the Column player is more dramatic, as EM goes from 60% to 40%, and that reduction corresponds to a rise in the minimax strategy PR from 34% to 53%.

In contrast to the PE game, there is a segregation between mixed strategy and pure strategy followers. Evidence of this is found in the estimated Markov transition matrices, which almost fail to be irreducible (roughly, irreducibility means we can always reach one state from another, even if it takes multiple transitions). The probability of continuing in the EM state is nearly one, indicating that once a subject follows the mixed strategy he is likely to do so for a large number of repetitions. Pure strategy adopters exhibit quite different patterns depending upon whether they are in the Row or Column role, in particular with respect to switching tendencies in the PL state. From the PL state, Row players transition to PR with 26% probability, while this transition probability is 79% for Column players.

Table 5: Estimated transition matrices, initial and limiting distributions of Gamble-Safe game

Row Player
                        PRt+1           EMt+1           PLt+1
PRt                     0.815 (0.027)   0.010 (0.020)   0.175 (0.016)
EMt                     0.003 (0.011)   0.988 (0.021)   0.009 (0.011)
PLt                     0.260 (0.031)   0.043 (0.031)   0.697 (0.037)
π                       0.169 (0.084)   0.779 (0.094)   0.052 (0.046)
Limiting Distribution   0.203           0.660           0.137

Column Player
                        PRt+1           EMt+1           PLt+1
PRt                     0.1 (0.010)     0.006 (0.010)   0.103 (0.007)
EMt                     0.007 (0.013)   0.985 (0.019)   0.008 (0.008)
PLt                     0.791 (0.031)   0.042 (0.031)   0.167 (0.044)
π                       0.337 (0.098)   0.596 (0.103)   0.067 (0.052)
Limiting Distribution   0.527           0.404           0.059

Note: standard deviations are in parentheses.
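As a rough check on how a limiting distribution relates to a transition matrix, the sketch below iterates π ← πP on the Row player's posterior mean transition probabilities from Table 5. It assumes the limiting distribution can be approximated from the posterior means alone; the paper may instead compute it draw by draw, so agreement is expected only up to rounding. The function name is illustrative.

```python
def limiting_distribution(P, iterations=10000):
    """Approximate the stationary distribution of a row-stochastic matrix by
    repeatedly applying pi <- pi P, starting from a uniform vector."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    total = sum(pi)
    return [p / total for p in pi]


# Posterior mean transition probabilities for the Row player in Table 5.
# Rows are the current state (PR, EM, PL); columns are the next state.
P_row = [
    [0.815, 0.010, 0.175],
    [0.003, 0.988, 0.009],
    [0.260, 0.043, 0.697],
]
pi_row = limiting_distribution(P_row)
print([round(p, 3) for p in pi_row])
```

The resulting vector agrees with the Row player's reported limiting distribution (0.203, 0.660, 0.137) to within rounding, which also confirms that the table rows index the current state and the columns the next state.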

Figure 6 presents the time series of the estimated proportion of subjects using each of the three latent strategies. Here we see the impact of the Markov transition probabilities that lead to inertia of the mixed strategy state and also the strong cycling tendencies of players between the Left and Right pure strategies. In the Row player figure, we see the EM strategy proportion has a smooth path that drops quickly from its initial level to its limiting value within the first 50 repetitions, after which it remains relatively constant. We also see the ragged mirroring pattern, indicating switching between the PR and PL strategies. We see similar features in the Column figure, except that the EM shows a more gradual decline and PR shows a corresponding gradual increase. This leads to the separation of the PR and PL series and allows us to see the clear short run switching between these strategies, characterized by the jagged mirror relationship between their respective series.

[Figure: two panels, Row and Column, plotting the NE, PL and PR strategy series (vertical axis 0.0 to 1.0) against Periods (0 to 200).]

Figure 6: Strategy dynamics in Gamble-Safe game

We test the robustness of our point prior B = [0, 0.5, 1] by estimating an HMM for which these state-conditional strategies each have a uniform Beta prior. The kernel smoothed empirical density functions of the posteriors are presented in Figure 7 for both Row and Column players. For the Row player, the lower and middle posteriors are closer together than our assumed sets. For the Column player, the posteriors of the lower two state-dependent strategies are shifted to the right of our assumed ones. We conjecture these shifts could come from erroneously assumed homogeneity of the strictly mixed strategy used by subjects. An alternative would be to increase the number of elements in S or to model the individuals' strictly mixed strategies as coming from a hierarchical process.

[Figure: two panels, Row and Column, plotting posterior Density against Probability to Choose L (0.0 to 1.0).]

Figure 7: Posterior distribution of B in Gamble-Safe game

4.3 Forecasting realized actions

Until now our primary concern has been the estimation of when subjects adopt pure and mixed strategies, and our HMM's function has been to provide a statistical framework to test theories about latent strategy choice. Now we explore the potential of the HMM to predict actions taken, a valuable capability in widespread applications from strategic maneuvers in military engagements to knowing when a poker player is bluffing.

We first consider how well the estimated HMMs coincide with the observed proportions of Left play in our experimental data set. For this forecasting exercise on the experimental panel data set we calculate, for each game and role, the predicted proportion of Left play by the M subjects in period t, $\widehat{Left}_t$:

$$\widehat{Left}_t = \frac{1}{M \cdot L} \sum_{l=1}^{L} \sum_{d=1}^{M} \sum_{j=1}^{N} I_{\{s^{l}_{d,t} = j\}} \, \beta_j .$$

Here L is the length of the sequence of the Gibbs sampler we use for statistical inference. For our data sets this sequence is iterations 5001 to 10000. Figure 8 presents plots of the time series of the predicted and actual proportions of Left play. In all four settings the predictions track the trends in the actual data. Admittedly this is an in-sample forecasting exercise, but the fit is nonetheless impressive, as minimizing forecast error is not the objective of our statistical inference exercise.
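The predicted proportion of Left play can be sketched directly from its definition: average, over retained sampler draws and subjects, the Left probability attached to each sampled state. The nested-list layout `state_draws[l][d][t]`, the function names, and the toy state indices below are assumptions made for illustration; the vector B is the Pursue-Evade point prior. The second function computes, in the same way, the share of subjects assigned to a given strategy in a period.

```python
def predicted_left_proportion(state_draws, B, t):
    """Average, over retained draws l and subjects d, of the Left probability
    B[j] of the state j assigned to subject d in period t.

    state_draws[l][d][t] is the sampled state index for draw l, subject d,
    period t (a hypothetical layout chosen for this sketch).
    """
    L = len(state_draws)
    M = len(state_draws[0])
    total = sum(B[state_draws[l][d][t]] for l in range(L) for d in range(M))
    return total / (M * L)


def state_proportion(state_draws, j, t):
    """Share of (draw, subject) pairs assigned to state j in period t."""
    L = len(state_draws)
    M = len(state_draws[0])
    hits = sum(state_draws[l][d][t] == j for l in range(L) for d in range(M))
    return hits / (M * L)


# Point prior for Pursue-Evade: states PR, EM, NE, PL as Left probabilities.
B = [0.0, 0.5, 0.67, 1.0]
# Toy input: 2 retained draws, 3 subjects, 2 periods (made-up state indices).
state_draws = [
    [[1, 1], [2, 3], [0, 2]],
    [[1, 2], [2, 3], [3, 2]],
]
left_t0 = predicted_left_proportion(state_draws, B, t=0)
share_ne_t0 = state_proportion(state_draws, j=2, t=0)
```

Because the indicator selects exactly one state per draw and subject, the triple sum collapses to a lookup of B at the sampled state, which is what the first function does.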

[Figure: four panels (PE Human-Row, PE Human-Column, GS Human-Row, GS Human-Column), each plotting the Estimated Proportion and the Empirical Proportion of Left Play (0.0 to 1.0) against Periods (0 to 200).]

Figure 8: Actual and forecasted proportions of Left play

Out-of-sample forecasting is of more practical use, and we can use the HMM for this purpose as well. We estimate, with 10,000 iterations of the Gibbs sampler, the HMM for both point and uniform Beta priors on B for the first 100 repetitions and use these estimates to make one-step-ahead forecasts of the last 100 repetitions. Let $\Psi = (P, \pi, B, s_{i,t})_{j=5001}^{10000}$ denote the realized draws of the Gibbs sampler for the last 5000 iterations of the MCMC algorithm that are used for statistical inference for the uniform Beta prior HMM. The predictive density of $s_{i,t}$ is obtained by simulation from the joint posterior sample Ψ as follows:

$$\hat{s}^{[j]}_{i,t} \sim p(s_{i,t} \mid P^{[j]}, \tilde{s}^{[j]}_{i,t-1}), \quad j = 5001, \ldots, 10000. \tag{1}$$

We can use these sampled states for subject i to generate 5000 draws from the following marginal posterior sample:

$$\hat{y}^{[j]}_{i,t} \sim p(y_{i,t} \mid \hat{s}^{[j]}_{i,t}, B^{[j]}), \quad j = 5001, \ldots, 10000. \tag{2}$$

The average of the 5000 draws made according to Equation 2, denoted $\hat{y}_{i,t}$, is the prediction of $y_{i,t}$. Next we use $y_{i,t}$ to generate the posterior density $\tilde{s}_{i,t}$ by Bayes' Rule. This is substituted into Equation 1 to start the process of generating the prediction of $y_{i,t+1}$. To assess the accuracy of our forecast of the holdout sample, we calculate and report the log-likelihood statistic

$$LL(y \mid \Psi) = \sum_{t=101}^{200} \sum_{i=1}^{M} \ln\left[ I\langle y_{i,t} \rangle \, p(\hat{y}_{i,t}) + (1 - I\langle y_{i,t} \rangle)(1 - p(\hat{y}_{i,t})) \right].$$

We also report the Akaike information criterion statistic, which is $AIC(y, \hat{y}) = -2 \cdot (LL - \text{number of model parameters})$.
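The forecasting recursion and the two fit statistics can be sketched as follows. This is a simplification under stated assumptions: it filters one subject's actions through a single fixed parameter draw (P, B) rather than averaging over the 5000 retained draws, and the transition matrix, initial belief, and action sequence are made-up values. The AIC function follows the definition given in the text.

```python
import math


def one_step_forecasts(actions, P, B, belief):
    """Filter a subject's actions (1 = Left, 0 = Right) through a fixed HMM.

    For each period: propagate the state belief through P, record the
    predicted probability of Left, then update the belief with the observed
    action by Bayes' rule.
    """
    n = len(B)
    probs = []
    for y in actions:
        prior = [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]
        probs.append(sum(prior[j] * B[j] for j in range(n)))
        like = [B[j] if y == 1 else 1.0 - B[j] for j in range(n)]
        joint = [prior[j] * like[j] for j in range(n)]
        norm = sum(joint)
        belief = [w / norm for w in joint]
    return probs


def holdout_loglik(actions, probs):
    """Sum of log predictive probabilities of the realized actions."""
    return sum(math.log(p if y == 1 else 1.0 - p) for y, p in zip(actions, probs))


def aic(loglik, n_params):
    """AIC as defined in the text: -2 * (LL - number of model parameters)."""
    return -2.0 * (loglik - n_params)


# Hypothetical single parameter draw: three states PR, EM, PL.
P = [[0.8, 0.1, 0.1], [0.05, 0.9, 0.05], [0.1, 0.1, 0.8]]
B = [0.0, 0.5, 1.0]
actions = [1, 1, 0, 1]
probs = one_step_forecasts(actions, P, B, belief=[1 / 3, 1 / 3, 1 / 3])
ll = holdout_loglik(actions, probs)
```

Each forecast uses only actions observed before that period, which is what makes the exercise one-step-ahead. Summing `holdout_loglik` over subjects and holdout periods gives a statistic of the same form as LL above.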

In order to evaluate the ability of alternative models to predict future actions in games, we compare the one-step-ahead forecasting performance of the HMMs with point (HMM) and uniform Beta (UHMM) priors on B against two alternatives: the Nash equilibrium strategy (NE) and individual-specific mixed strategies (IM), which are estimated by each subject's proportion of Left play in the first 100 repetitions.
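The IM benchmark is simple enough to sketch in a few lines: estimate a subject's Left probability from the estimation window and score it on the holdout window. The function name, the clipping constant `eps` (added here so the logarithm stays finite if a subject played only one action in the estimation window), and the short action sequence are assumptions of this sketch rather than details from the paper.

```python
import math


def im_holdout_loglik(actions, split=100, eps=1e-6):
    """Individual-specific mixed strategy benchmark for one subject.

    The Left probability is the subject's share of Left play (1 = Left) in
    the first `split` periods; it is then scored on the remaining periods.
    eps is an assumption of this sketch that keeps log() finite when a
    subject played only one action in the estimation window.
    """
    train, test = actions[:split], actions[split:]
    p_left = sum(train) / len(train)
    p_left = min(max(p_left, eps), 1.0 - eps)
    ll = sum(math.log(p_left if y == 1 else 1.0 - p_left) for y in test)
    return p_left, ll


# Made-up action sequence: 4 estimation periods, 4 holdout periods.
p_hat, ll_im = im_holdout_loglik([1, 0, 1, 1, 1, 1, 0, 1], split=4)
```

Because this benchmark fits one parameter per subject, its parameter count grows with the number of subjects, which is why the AIC penalty matters when it is compared with the HMMs.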

We summarize the out-of-sample forecasting performance for each of the four models in Table 6. First, for the Row players in both game treatments the two HMMs outperform the two other models, whether or not we penalize for the number of parameters. For the holdout sample of the Column players, when not penalizing for the number of parameters, the IM model performs comparably to the two HMM models in the Pursue-Evade game, and the IM model performs comparably to the UHMM model (both of which outperform the HMM) in the Gamble-Safe game. However, when we penalize for increasing numbers of parameters we see the UHMM clearly outperforms the IM model. This suggests that our homogeneous dynamic model performs well in forecasting a population of game players, but also suggests that allowing for more individual heterogeneity could lead to even better out-of-sample forecasting performance.

Table 6: Out of sample forecasting performance

                          P-E Game              G-S Game
Role            Model     Loglik     AIC        Loglik     AIC
Row Player      NE        −1748      3497       −1941      3884
Row Player      IM        −1736      3526       −1916      3887
Row Player      HMM       −1723      3475       −188737
Row Player      UHMM      −1710      3458       −1865      3751
Column Player   NE        −2058      4117       −1941      3885
Column Player   IM        −1775      3605       −1429      2915
Column Player   HMM       −1776      3583       −1517      3050
Column Player   UHMM      −1771      3580       −143624

5 Discussion

We have introduced an HMM for the detection of pure and mixed strategy play in repeated games. We then applied this model to data from a new experiment in which human subjects repeatedly play against computer opponents that were programmed to play their part of the mixed strategy Nash equilibrium. We find that subjects do play both pure and mixed strategies, and switch between these over the course of play. Further, we find there is non-stationarity in the distribution of latent strategies over time. We observe a large movement from the initial distribution over strategies to the limiting distribution of the HMM. However, while the limiting distribution assigns probability to the subjects' NE strategy, the assigned probability is less than 1. Thus, for our data, we show that a mixed strategy Nash equilibrium is only partially self-enforcing. This is a new result in behavioral game theory, as previous studies have only considered the composite hypothesis that mixed strategy equilibria are both self-enforcing and also the limit point of the subjects' learning process.

Our primary interest has been modeling a population of players interacting in a game with known payoffs; however, there are several natural extensions to our approach. First, we could focus on the modeling and forecasting of a single subject from the population. To do this, we likely need to allow more individual heterogeneity in the HMM. A first step would be to allow each player to have a set of individual-specific strict mixed strategies to follow. This could be done by allowing individual state-dependent mixed strategies B_i, or by modeling these B_i as coming from a hierarchical structure characterized by a small set of hyperparameters. A second extension is to model strategic situations in the field, in which the game payoffs are not known because of unobserved individual heterogeneity. For example, soccer players making penalty kicks may vary in their strength of kicking left or right, and similarly goalies may have unobservable differences in defending kicks to the left and right. In such cases, the HMM can help identify such payoffs and also describe the players' learning process regarding these latent payoff types.

The HMM as presented in this paper is currently more of a statistical description than a behavioral model derived from optimizing behavior. To become such a behavioral model, the transition probabilities must become an endogenous function of a player's expected payoffs for the differing latent strategy choices. One possible approach is to allow a player to form beliefs about an opponent's action and then best respond. The issue here is that in an expected utility world a mixed strategy is never a strict best response. However, if one takes the approach that uncertainty about an opponent's action is ambiguous (i.e., a player doesn't have the ability to form a unique prior), then an ambiguity-averse player may strictly prefer a mixed strategy over pure strategies.

References

Aumann, Robert J. (1985), “On the non-transferable utility value: A comment on the Roth-Shafer examples.” Econometrica, 53, 667–678.

Bareli, M, O Azar, I Ritov, Y Keidarlevin, and G Schein (2007), “Action bias among elite soccer goalkeepers: The case of penalty kicks.” Journal of Economic Psychology, 28, 606–621.

Binmore, Ken, Joe Swierzbinski, and Chris Proulx (2001), “Does minimax work? An experimental study.” Economic Journal, 111, 445–.

Bloomfield, Robert (1994), “Learning a mixed strategy equilibrium in the laboratory.” Journal of Economic Behavior & Organization, 25, 411–436.

Chiappori, Pierre, Steven Levitt, and T. Groseclose (2002), “Testing mixed-strategy equilibria when players are heterogeneous: The case of penalty kicks in soccer.” American Economic Review, 92, 1138–1151.

Geman, Stuart and Donald Geman (1987), “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images.” In Readings in computer vision: issues, problems, principles, and paradigms (Martin A. Fischler and Oscar Firschein, eds.), 5–584, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

Geweke, John (1991), “Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments.” Staff Report 148, Federal Reserve Bank of Minneapolis.

Greiner, Ben (2004), “An online recruitment system for economic experiments.” In Forschung und wissenschaftliches Rechnen (Kurt Kremer and Volker Macho, eds.), volume 63 of Ges. für Wiss. Datenverarbeitung, 79–93, GWDG Bericht.

Morgan, John and Martin Sefton (2002), “An experimental investigation of unprofitable games.” Games and Economic Behavior, 40, 123–146.

Nash, John (1951), “Non-Cooperative Games.” The Annals of Mathematics, , 286–295.

Noussair, Charles and Marc Willinger (2011), “Mixed strategies in an unprofitable game: an experiment.” Working papers, LAMETA, University of Montpellier.

Nyarko, Yaw and Andrew Schotter (2002), “An experimental study of belief learning using elicited beliefs.” Econometrica, 70, 971–1005.

Ochs, Jack (1995), “Games with unique, mixed strategy equilibria: An experimental study.” Games and Economic Behavior, 10, 202–217.

O’Neill, Barry (1987), “Nonmetric test of the minimax theory of two-person zero-sum games.” Proceedings of the National Academy of Sciences, U.S.A., 84, 2106–2109.

Palacios-Huerta, Ignacio (2003), “Professionals play minimax.” Review of Economic Studies, 70, 395–415.

Rabiner, Lawrence R. (19), “A tutorial on hidden Markov models and selected applications in speech recognition.” Proceedings of the IEEE, 77, 257–286.

Rosenthal, Robert W., Jason Shachat, and Mark Walker (2003), “Hide and seek in Arizona.” International Journal of Game Theory, 32, 273–293.

Selten, Reinhard and Thorsten Chmura (2008), “Stationary concepts for experimental 2x2-games.” American Economic Review, 98, 938–966.

Shachat, Jason (2002), “Mixed strategy play and the minimax hypothesis.” Journal of Economic Theory, 104, 1–226.

Shachat, Jason and J. Todd Swarthout (2004), “Do we detect and exploit mixed strategy play by opponents?” Mathematical Methods of Operations Research, 59, 359–373.

Von Neumann, John (1928), “Zur Theorie der Gesellschaftsspiele.” Mathematische Annalen, 100, 295–320.

Von Neumann, John and Oskar Morgenstern (1944), Theory of Games and Economic Behavior. Princeton University Press.

Walker, Mark and John Wooders (2001), “Minimax play at Wimbledon.” American Economic Review, 91, 1521–1538.
