
Engineering a Semantic Web for Pathology


Robert Tolksdorf (1), Elena Paslaru Bontas (2)

Freie Universität Berlin, Institut für Informatik
AG Netzbasierte Informationssysteme, Takustr. 9, D-14195 Berlin, Germany

(1) research@robert-tolksdorf.de, http://www.robert-tolksdorf.de
(2) paslaru@inf.fu-berlin.de

Abstract. Digital pathology or telepathology intends to extend the usage of electronic images for diagnostic, support or educational purposes in anatomical or clinical pathology. Available approaches have not found wide acceptance in routine pathology, mainly due to the limitations of image retrieval. In this paper we propose a semantic retrieval system for the pathology domain. The system brings both text and image information together and offers advanced content-based retrieval services for diagnosis, differential diagnosis and teaching tasks. The core of the system is a Semantic Web gathering both ontological domain knowledge and rules describing key tasks and processes in pathology.

1 Introduction

Digital pathology or telepathology intends to extend the usage of electronic images for diagnostic, support or educational purposes in anatomical or clinical pathology. The advantages of these approaches are generally accepted and several applications are already available. Nevertheless, none of the available products has found wide acceptance for diagnostic tasks, mainly due to the huge amount of data resulting from the digitization process and the limitations of image-based retrieval. In this paper we propose a semantic retrieval system for the pathology domain. The system brings both text and image information together and offers advanced content-based retrieval services for diagnosis, differential diagnosis and teaching tasks. The core of the system is a Semantic Web gathering both ontological domain knowledge and rules describing key tasks and processes in pathology. The usage of Semantic Web standards and domain ontologies facilitates the realization of a distributed infrastructure for knowledge sharing and exchange. The rest of this paper is organised as follows: The remaining introductory sections present the setting of the project, telepathology, and its main ideas and features. Section 3 provides an insight into the technical aspects of the retrieval system by enumerating the technical requirements and the associated system architecture, followed by a detailed description of the system components. At this point we present our achievements and the challenges we currently face in realizing the main components. Section 4 distinguishes our approach from related research efforts in this domain, while Section 5 is dedicated to future work.

1.1 Telepathology

Telepathology is a key domain in telemedicine. By using telepathology approaches like virtual microscopy, pathologists analyze high-quality digital images on a display screen instead of conventional glass slides at the common light microscope. In a typical digital pathology system, a camera is attached to a microscope and still images are taken. Images (with or without textual annotations) are stored in a database or directly in a patient record. Cases and images can be retrieved from the database or patient record as needed.

Health care information systems, which store and integrate information and coordinate actions among health care professionals, have been realized at various places in the last decades. New developments in telemedicine allow medical personnel to remotely deliver health care to the patient. At the Charité Institute of Pathology in Berlin, the first web-based virtual microscope allows histological information to be evaluated, transferred, and stored in digital format [17, 14]. This technique offers essential advantages compared to the classical approach by supporting communication and exchange among professionals not sharing the same workplace location and by improving quality assurance mechanisms [15]. However, to realize a complete computer-based infrastructure for pathology, advanced support for the management of digital images is not enough. A more efficient integration of the medical findings, which pathologists produce to describe their observations from analyzing the slides at the light or digital microscope, is also necessary.

Common information systems in pathology restrict their retrieval capabilities to automatic picture analysis and ignore the corresponding medical findings. Such analysis algorithms have the essential drawback that they operate exclusively on structural, or syntactic, parameters such as color, texture and basic geometrical forms while ignoring the real content and the actual meaning of the pictures. Medical findings, however, contain much more than that, since they are textual representations of the pictorially represented content of the slides. They thereby capture the actual semantics of what the picture graphically represents, for example “a tumor” in contrast to “a red blob” or “a colocated set of red pixels”. Therefore, including medical findings in the information retrieval system goes beyond purely syntactic picture retrieval.

In the project described in this paper, we take the semantic aspects a step further: we understand the findings report as high-quality semantic metadata for the image, prepared by an expert. We intend to make the semantic content explicit and build a system that takes advantage of the explicitly represented knowledge.

2 A Semantic Web for Pathology

The project “Semantic Web for Pathology” aims to realize a Semantic Web-based text and picture retrieval system for the pathology domain. For this purpose we concentrate our efforts in three interrelated directions: 1) the construction of a knowledge base, 2) the development of knowledge reuse algorithms and 3) the development of a semantic annotation schema for medical findings and digital histological images. The knowledge base contains domain ontologies, generic ontologies and rules. Domain ontologies are used for the machine-processable representation of specific pathology knowledge, while generic ontologies capture common sense knowledge that can be useful in knowledge-intensive tasks. Several very complex libraries of ontologies are already available for this purpose. Rules are intended to formalize the key tasks in everyday pathology. While ontologies model the background knowledge of the pathologists, the rules are used to describe the decision processes using this knowledge: diagnostics, microscope analysis, observations etc. The acquisition of such rules, which play a crucial role for the retrieval, will be accomplished in intensive collaboration with domain experts.

Further on, we analyze the textual data with text processing algorithms and annotate it with concepts from the knowledge base in order to improve precision and recall in retrieval operations. The annotation scheme is harmonized with the pathology knowledge base by using the corresponding medical ontologies as controlled vocabulary for the annotations. Text analysis is also used to extract implicit factual knowledge, which is subsequently integrated in the knowledge base.

2.1 Main features

We foresee several valuable uses of the planned system in routine pathology. First, it may be used as an assistant tool for diagnosis tasks. Since knowledge is made explicit, it supports new query capabilities for diagnosis tasks: similarity or identity of cases based on semantic rules and medical ontologies, differential diagnosis, and semantically precise statistical information about occurrences of certain distinguishing criteria in a diagnosis case. The provided information will be very valuable in diagnosis work, especially for the underdiagnosed cases, since such situations require deeper investigations of the problem domain and a very strict control mechanism of the diagnosis quality ([5]).

Second, advanced retrieval capabilities may be used for educational purposes by teaching personnel and students. Currently, enormous amounts of knowledge are lost by being stored in databases that behave as real data sinks. They can and should be used for teaching, e.g. for case-based medical education.

Third, quality assurance and checking of diagnosis decisions can be carried out more efficiently because the system uses axioms and rules to automatically check consistency and validity.

Finally, explicit knowledge can be exchanged with external parties like other hospitals. The representation within the system is already the transfer format for information. Semantic Web technologies are by design open for the integration of knowledge that is relative to different ontologies and rules. Therefore we intend to use mainly such technologies for the realization of the retrieval system.

2.2 Use cases and technical requirements

The technical analysis and design of the pathology retrieval system is closely related to typical usage scenarios, which are not necessarily related to routine pathology. Most probably, the system will be used for underdiagnosed cases, where a second or third opinion is to be consulted or the specialist usually resorts to certified control sources, like Internet or printed material. Such information sources have an essential drawback: they offer limited capabilities for a thematically focused search. Both manual search within printed materials and Internet search, based on common or medicine-related search engines, are time-consuming and not specific enough to be integrated in everyday pathology. Instead, our system will offer the possibility to search the archive of medical findings for similar cases or differential diagnosis. It is improbable that the system will be consulted for routine cases, which cover approximately 80 percent of the total amount and are analyzed on the fly by the pathologists without the need for additional information sources.

The acceptance of the system is strictly related to its minimally invasive character: it should not imply any change of the current workflows and should achieve good precision results. Recall is also important, but since the two parameters usually conflict, we favor precision, mainly because of the predominant usage of the system for underdiagnosed cases, within which every detail may play an important role for the final results. The minimally invasive character will be reflected in a careful design of the user interfaces and an intuitive query language.
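The trade-off between the two parameters can be made concrete with the standard definitions of precision and recall. The following is a minimal sketch; the case identifiers and the two example queries are hypothetical and are not taken from the project data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: set of case ids returned by the system
    relevant:  set of case ids a pathologist judges relevant
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical case ids, for illustration only.
relevant = {"case-01", "case-02", "case-03", "case-04"}
strict_query = {"case-01", "case-02"}  # few results, all relevant
broad_query = {"case-01", "case-02", "case-03",
               "case-07", "case-08", "case-09"}  # more hits, more noise

print(precision_recall(strict_query, relevant))  # (1.0, 0.5)
print(precision_recall(broad_query, relevant))   # (0.5, 0.75)
```

In this toy setting the stricter query returns only relevant cases but misses half of them, while the broader query finds more of them at the price of irrelevant results. A system that favors precision would lean towards the first behaviour.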

Another important setting is teaching: the system should therefore be able to generate different reference materials and to retrieve information about typical pathology cases and their diagnosis. The key feature for this second scenario is the flexibility to generate and present domain information.

The network aspect is important for both settings. Pathologists use the system for cases where they need the remote collaboration of other specialists. The teaching scenario also assumes a distributed infrastructure, so that the resources can be accessed anytime, anywhere. The usage of Semantic Web technologies on one side, and of standards like XML/OWL and the medical HL7/DICOM on the other, is a condition for the realization of this requirement.

Scalability and performance are critical factors for the acceptance of a retrieval system. In our application, the amount of image data is considerable. Every particular case contains up to 10 medical findings. Each of these is based on up to 50 digital histological images, which usually have a size of 4-5 GB each. Our first prototypical implementation of the system will deal with approximately 400 findings and a part of the corresponding digitised slides.
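To give a feeling for these orders of magnitude, the figures above can be multiplied out. The sketch below computes worst-case upper bounds only, assuming every finding uses the maximum of 50 images at 5 GB each. The prototype covers only a part of the slides, so its real volume will be lower.

```python
# Upper bounds stated in the text above.
FINDINGS_PER_CASE = 10    # "up to 10 medical findings" per case
IMAGES_PER_FINDING = 50   # "up to 50 digital histological images" per finding
GB_PER_IMAGE = 5          # upper end of the "4-5 GB" range
PROTOTYPE_FINDINGS = 400  # approximate size of the first prototype


def image_volume_gb(findings,
                    images_per_finding=IMAGES_PER_FINDING,
                    gb_per_image=GB_PER_IMAGE):
    """Worst-case image volume in GB for a given number of findings."""
    return findings * images_per_finding * gb_per_image


per_case_gb = image_volume_gb(FINDINGS_PER_CASE)
prototype_gb = image_volume_gb(PROTOTYPE_FINDINGS)

print(per_case_gb)   # 2500 GB, about 2.5 TB for a single worst-case case
print(prototype_gb)  # 100000 GB, about 100 TB if every slide were included
```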

The storage of images will still be subject to the use of specialized image databases. Our approach of resorting to the description of images contained in the findings and their processing in the system makes the requirements on scalability with the number and complexity of cases independent of the size of the image data. No image processing is foreseen; instead we use the result of the image analysis performed by human experts, the pathologists.

Remaining scalability and performance issues are affected by the quality of the underlying Semantic Web components and the complexity of models used and inferences drawn therein. Currently, there are strong efforts to produce industrial-strength Semantic Web components, such as inference engines that go beyond the poor performance of early research prototypes. Our system will benefit from this performance gain in the infrastructure.

The complexity of models, rules and queries triggering inferences remains a critical issue. While we have a substantial basis of models with existing standards, it is not yet clear what heuristics should guide the selection of the granularity of models eventually used and of the details of rules applied when finding “similar” cases. We will restrict ourselves to small models and rule sets that generate sufficiently precise answers by the system with minimal inferencing effort. The precise methodology for doing so is the subject of our current studies.

3 Engineering the System

Technically the system resorts to Semantic Web technologies. The Semantic Web ([1]) aims to provide automated information access based on machine-processable semantics of data. The final vision is to develop a technological framework that will transform the Web into a huge network of both human- and machine-understandable knowledge with various specialized reasoning services. The first steps in this direction have been made through the realization of appropriate representation languages for Web knowledge sources like RDF(S) and OWL and the increasing dissemination of ontologies, which provide a common basis for annotation and support automatic inferencing for the generation of knowledge.

Our approach makes use of these Semantic Web technologies in order to represent pathology knowledge explicitly and, consequently, refine the retrieval algorithms on a semantic level: medical and generic ontologies are integrated into a pathology knowledge base, which also serves as annotation vocabulary for medical findings and histological images. We use OWL and RDF(S) both for the representation of the knowledge base and for the annotation of the information items, and XML-based medical standards like HL7/CDA ([10, 9]) for the medical findings.

In medicine and biology exhaustive domain ontologies have been developed and are constantly incorporating new pieces of knowledge. Ontologies like UMLS ([16]), GALEN ([6]) and Gene Ontology ([4]) provide a good basis for the development of Semantic Web applications for medical purposes. These ontologies are therefore used as the initial knowledge base of the semantic retrieval system for pathology. In addition, to put our goals into practice we still need to integrate the individual domain knowledge sources and to adapt them to the requirements of the Semantic Web, which means in the first place to formalize them in a Semantic Web representation language. Our analysis in the application domain has revealed the necessity of a powerful representation language, which can capture most of the semantic features of the medical knowledge. For this purpose we will mostly use OWL instead of RDF(S), mainly because of its expressiveness and inferencing capabilities. The main issues we address w.r.t. the available medical ontologies will be explained in detail in Section 3.2.

3.1 System architecture

We propose the following system architecture, which has arisen from the use cases and the corresponding technical requirements (Figure 1):

– description component
– knowledge component
– transformation component
– application components

In the following we briefly explain the role of each component and their interaction; a detailed description of the features and related research issues is presented in Section 3.2.

[Figure 1: diagram relating the digital microscope, the digital histological images, the compound and the medical findings to the description component, the transformation component, the knowledge component and the application components (diagnosis, retrieval, statistical references, teaching materials, case-based presentations, quality checking)]

Fig. 1. System architecture “Semantic Web for Pathology”

The core of the system architecture is the knowledge component (Figure 3), which consists of domain and generic ontologies, as well as a rule engine. The knowledge component influences every process of the remaining components. The medical findings and histological images are analyzed semantically and linguistically within the description component. The explicitly represented knowledge is used to check the consistency of medical findings and picture annotations during their generation. The description component also allows the XML encoding of the textual and pictorial data. Both the available pathology database at the Charité hospital and data to be generated are described in XML in this manner. The transformation component takes the XML-structured data set and integrates it within the semantic network underlying the knowledge component. Due to the application-oriented character of the system, special attention in the architecture is paid to the application components, which implement the functionality of the system as presented in Section 2. The search component is used both by pathologists in order to retrieve information concerning diagnosis tasks and by teaching personnel and students. We also plan a component for the generation of statistical evaluations (e.g. related to the most frequent disease symptoms, relationships between patient data and disease evolution etc.) and for the generation of case-oriented teaching materials and presentations (see Figure 1). The quality checking service is intended to evaluate the consistency of medical findings.

3.2 Main components

The Description Component

The description component is concerned with the basic formalization of medical findings and digital histological images. For this purpose it deals with two principal data sources: data which is already available at the Institute of Pathology at the Charité hospital, and future data. The goal of this process is to offer a homogeneous encoding of medical findings on one side and picture annotations on the other side, both for existing and future material. The data should first be encoded in XML and subsequently analyzed using ontology-enhanced text analysis algorithms in order to be annotated with ontology concepts. For the generation of new XML-based information we developed an editor tool, which can be integrated in the current version of the Digital Virtual Microscope ([17, 14]). By means of this tool pathologists can analyze digitised histological images and simultaneously enter or update the corresponding medical finding, which is subsequently stored in an XML database. The second source of raw data was naturally the medical findings archive at the Charité. The medical findings of this type have been extracted from their primary text-oriented storage and transformed into XML.

We developed an HL7/CDA-compatible XML scheme for the medical findings, which reflects the logical structure of the data. Such medical data is organized more or less consistently in four major parts:

– macroscopy, describing physical properties and the appearance of the original compound.

– microscopy, concerned with the detailed description of the slides analyzed at the microscope.

– diagnosis, summarizing the conclusions and the diagnosis.

– comments, usually presenting additional facts playing a role in the diagnosis argumentation (patient data, patient history etc.) or an alternative diagnosis for ambiguous cases.

Besides, such a medical finding also contains information from the patient record and references to digital images. The connection to the digital images is fundamental for an efficient retrieval, which should contain, apart from the relevant textual information, the corresponding image region the pathologist refers to in a certain portion of text. Since the size of such images is 4-5 GB, it is not sufficient to retrieve complete images for a certain user query; the concrete image sector is needed. For this purpose we use the functionality of the Digital Virtual Microscope, which allows digital slides to be annotated with so-called “observation paths” on one side, and registers an additional “dictation path” on the other. The observation path contains image coordinates, image resolution and timestamps registered while the pathologist was analyzing a specific digital image. The dictation path records the same data, this time registered while the pathologist was typing the medical finding. The complete path-related information flows into the “diagnosis path”, which mirrors the way the diagnosis decision was reached.
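The path data described above can be pictured as a simple record structure. The sketch below is an illustration only: the class and field names are hypothetical, and treating the diagnosis path as a chronological merge of the observation and dictation paths is an assumption, not the documented behaviour of the Digital Virtual Microscope.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PathPoint:
    x: int             # image coordinates of the viewed region
    y: int
    resolution: float  # resolution (magnification) level in use
    timestamp: float   # seconds since the start of the session
    source: str        # "observation" or "dictation"


def diagnosis_path(observation: List[PathPoint],
                   dictation: List[PathPoint]) -> List[PathPoint]:
    """Merge both paths chronologically (an assumed definition)."""
    return sorted(observation + dictation, key=lambda p: p.timestamp)


def regions_between(path: List[PathPoint],
                    start: float, end: float) -> List[PathPoint]:
    """Points recorded in a time window, e.g. while one sentence was typed."""
    return [p for p in path if start <= p.timestamp <= end]


# Hypothetical sample data.
observation = [PathPoint(1200, 800, 10.0, 3.0, "observation"),
               PathPoint(5400, 2100, 40.0, 12.5, "observation")]
dictation = [PathPoint(5400, 2100, 40.0, 15.0, "dictation")]

path = diagnosis_path(observation, dictation)
print([p.source for p in path])  # ['observation', 'observation', 'dictation']
print(len(regions_between(path, 10, 20)))  # 2
```

With timestamps attached to every point, a portion of the finding text could be linked to the image sector that was on screen at the time, instead of to the whole 4-5 GB image.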

The proposed XML Schema reconstructs the structure of the real medical findings and is HL7-compatible. Though the compatibility restricts the format of the XML findings (the information must be encoded within “section”, “paragraphs” and “coded entry” tags, which is not necessarily the most straightforward manner of formalizing it), it is an important issue, especially for the distributed setting, for the exchange and reuse of information.
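As a rough illustration of such a section-based encoding, the following sketch builds and re-reads a small finding with Python's standard `xml.etree.ElementTree` module. The element names (`body`, `section`, `caption`, `paragraph`, `coded_entry`), the `code` attribute and the code value are illustrative assumptions. They have not been validated against the HL7/CDA schema or against the project's own XML Schema. The section titles and the sample sentence are taken from Figure 2.

```python
import xml.etree.ElementTree as ET

# Section titles as they appear in Figure 2.
SECTIONS = ["Makroskopie", "Mikroskopie", "Kritischer_Bericht", "Kommentar"]


def build_finding(texts, codes=None):
    """Build a finding as nested section/paragraph elements."""
    codes = codes or {}
    body = ET.Element("body")
    for title in SECTIONS:
        section = ET.SubElement(body, "section")
        ET.SubElement(section, "caption").text = title
        ET.SubElement(section, "paragraph").text = texts.get(title, "")
        for code in codes.get(title, []):
            # Hypothetical coded entry pointing to an ontology concept.
            ET.SubElement(section, "coded_entry", {"code": code})
    return body


def section_text(body, title):
    """Return the paragraph text of the section with the given caption."""
    for section in body.findall("section"):
        if section.findtext("caption") == title:
            return section.findtext("paragraph")
    return None


finding = build_finding(
    {"Makroskopie": "Zwei Gewebszylinder von 15 und 4 mm Länge."},
    codes={"Makroskopie": ["EXAMPLE-0001"]},  # made-up concept code
)
xml_text = ET.tostring(finding, encoding="unicode")
reparsed = ET.fromstring(xml_text)
print(section_text(reparsed, "Makroskopie"))
```

The round trip through `ET.tostring` and `ET.fromstring` shows that the four-part structure survives serialization, which is the property that matters for exchanging findings between sites.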

The Knowledge Component

The knowledge component includes the medical knowledge base and the algorithms for the realization of the applications. As mentioned in Section 3.1, it is built of a library of domain and generic ontologies, a rule engine and the annotated pathology data (Figure 3). We use available medical ontologies as a foundation of the knowledge base, starting with UMLS ([16]) and Gene Ontology ([4]).

...
xmlns:sciphox="urn::sciphox-org/sciphox"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn::hl7-org/cda sciphox-cda.xsd"
xmlns:swpatho="urn::swpatho-org">
...
Befund
  Makroskopie
    Zwei Gewebszylinder von 15 und 4 mm Länge.
  Mikroskopie
    Stanzbiopsate aus Lungengewebe mit deutlicher Störung der alveolären Textur, soweit noch nachweisbar deutlich Verbreiterung der Alveolarsepten, stellenweise Nachweis von Bronchialepithelregeneraten. Restliche Alveolarlumina z.T. durch Fibroblastenproliferate verlegt. Im Interstitium ein gemischt entzündliches Infiltrat, bestehend aus Plasmazellen und Lymphozyten. ...
  Kritischer_Bericht
    Stanzbiopsate aus der Lunge mit Zeichen der organisierenden Pneumonie (klin. Mittellappen).
  Kommentar
    Nach klinischer Angabe vordiagnostiziertes kutanes T-Zell-Lymphom, jetzt 2 bis 3 cm große pleuraständige Raumforderung im Mittellappen. Im vorliegenden Material kein Anhalt für eine Lymphom-Manifestation. Kein Karzinom.

Fig. 2. Fragment of an XML-encoded medical finding

[Figure 3: the knowledge component, comprising medical ontologies (including Gene Ontology), medical findings, image annotations and a rule engine]

Fig. 3. The knowledge component

The most important issue we have to address when building the pathology knowledge base is the integration and the enrichment of the available medicine standards. Medical ontologies, though containing a huge amount of concepts or terms, have seldom been developed for machine processing, but rather as controlled vocabularies and taxonomies for specific tasks in medicine ([13]). From a strict Semantic Web point of view they proved to be deficiently designed and incomplete. Apart from the absence of a representation language that is at least Semantic Web compatible, UMLS and Gene Ontology adopt an error-prone modeling style, which is characterized by few semantic relations among concepts and an ambiguous way to interpret such relations (e.g. concepts of the UMLS Metathesaurus are connected through relations like “related”, “broader”, “narrower”). A typical example is the usage of the relation “is-a” for both instantiation and specialization/generalization, the usage of a unique “part-of” relation with different meanings (“functional part”, “content”, “component”, “substance”) or the usage of one of these relations instead of the other. Mathematical properties of the same semantic relation (e.g. transitivity) are not fulfilled for each pair of concepts connected by the relation, and the “is-a” relation between two concepts does not always guarantee the inheritance of the properties of the parent concept to its children (so-called “blocked” relations in UMLS). Besides relations, both UMLS and Gene Ontology contain a huge set of conceptual entities, organized in several taxonomies. The classification criteria for concepts are inconsistent and incomplete. Different, unspecified granularities are used within a hierarchy and properties may not be inherited along inheritance paths.
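One of the deficiencies named above, a relation that is expected to be transitive but is not asserted for every implied pair, lends itself to a simple automatic check. The sketch below is a minimal illustration over hypothetical “part-of” assertions; the concept names are invented and are not taken from UMLS or Gene Ontology. It performs a single composition step rather than computing the full transitive closure.

```python
from itertools import product


def missing_transitive_pairs(pairs):
    """Return pairs (a, c) implied by one transitivity step but not asserted."""
    relation = set(pairs)
    missing = set()
    for (a, b), (c, d) in product(relation, repeat=2):
        if b == c and a != d and (a, d) not in relation:
            missing.add((a, d))
    return missing


# Hypothetical part-of assertions, for illustration only.
part_of = {
    ("alveolus", "lung lobe"),
    ("lung lobe", "lung"),
    ("lung", "respiratory system"),
    ("alveolus", "lung"),
}

for pair in sorted(missing_transitive_pairs(part_of)):
    print(pair)
# ('alveolus', 'respiratory system')
# ('lung lobe', 'respiratory system')
```

A reported pair is not necessarily a modelling error. It may simply be left implicit, or the relation may carry different meanings along the chain, which is exactly the ambiguity that requires review by domain experts.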

The issue of the restricted representation language is addressed in several projects, which usually develop their own representation languages, adapted to specific requirements of the medical domain w.r.t. expressiveness and inferencing capabilities. Such an ontology, though with increased inference capabilities compared to UMLS or Gene Ontology, cannot be embedded offhand in a Semantic Web application, shared or completed in a Semantic Web setting. Moreover, various ontologies have been developed for particular purposes and cannot be integrated automatically. Besides the need for integration and completion, such ontologies seldom contain axiomatic knowledge, which is essential for diagnostics or therapy settings.

Therefore we need to adopt a Semantic Web representation scheme for the available ontological knowledge and complete it with additional axioms and definitions on the one hand, and on the other hand encode therapy, diagnostic and task knowledge in a supplementary module as rules. For this purpose we will use standards like RuleML. In order to design an appropriate representation based on Semantic Web we will first identify, in collaboration with domain experts, the fragments of UMLS/Gene Ontology which are relevant in pathology. Secondly, we will analyze the deficiencies of the available medical standards by transforming their content into OWL and automatically discovering inconsistencies. The next step will be the manual adaptation of the OWL ontology according to the results of the previous procedure. Currently we are implementing an algorithm for the OWL transformation of UMLS knowledge sources. The underlying modelling primitives are illustrated in Figure 4.

Fig. 4. UMLS modelling primitives in OWL
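To make the idea of such a transformation more tangible, the following sketch serialises a few concept and relation tuples as OWL statements in Turtle syntax. It is not the algorithm under development. The mapping is an assumed simplification: every concept becomes an `owl:Class`, “isa” becomes `rdfs:subClassOf`, and any other relation becomes an existential restriction. The namespace, the concept labels and the `has_location` relation are hypothetical.

```python
BASE = "http://example.org/swpatho#"  # hypothetical namespace


def local_name(label):
    """Turn a concept label into a simple CamelCase identifier."""
    return "".join(word.capitalize() for word in label.split())


def to_turtle(concepts, relations):
    """Serialise concepts and relations as a small Turtle document.

    Assumed mapping: concept -> owl:Class, "isa" -> rdfs:subClassOf,
    any other relation -> someValuesFrom restriction on an object property.
    """
    lines = [
        f"@prefix : <{BASE}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    properties = sorted({rel for _, rel, _ in relations if rel != "isa"})
    for prop in properties:
        lines.append(f":{prop} a owl:ObjectProperty .")
    for label in concepts:
        lines.append(f':{local_name(label)} a owl:Class ; rdfs:label "{label}" .')
    for source, relation, target in relations:
        s, t = local_name(source), local_name(target)
        if relation == "isa":
            lines.append(f":{s} rdfs:subClassOf :{t} .")
        else:
            lines.append(
                f":{s} rdfs:subClassOf [ a owl:Restriction ; "
                f"owl:onProperty :{relation} ; owl:someValuesFrom :{t} ] ."
            )
    return "\n".join(lines)


# Hypothetical input tuples, for illustration only.
concepts = ["Pneumonia", "Organizing pneumonia", "Lung"]
relations = [
    ("Organizing pneumonia", "isa", "Pneumonia"),
    ("Pneumonia", "has_location", "Lung"),
]
turtle = to_turtle(concepts, relations)
print(turtle)
```

Once the content is available in this form, an OWL reasoner could be run over it to surface inconsistencies, which is the purpose of the second step described above.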

Besides an ontology-based background, the knowledge base also contains the complete set of medical findings and image descriptions, both represented in XML by means of the description component. However, in order for this additional information to be involved in retrieval and knowledge discovery, the XML basic scheme needs to be enriched with annotations referencing ontology concepts and relations. For this purpose we intend to use text processing algorithms for an initial automatic annotation phase and to implement an annotation tool for a subsequent manual annotation phase, which completes the automatic process.

The Transformation Component

The transformation component implements features required for the text-based processing of the medical findings and image descriptions. For this purpose we are currently developing a noun phrasing module, which identifies domain-specific phrases from medical findings. The module incorporates a tokenizer, a tagger and an ontology-based phrase generator. The phrase generation process interacts with the knowledge base, since it uses medical ontologies to identify relevant (multi-word) phrases and at the same time puts together a lexicon tailored for the particular application setting: the domain of lung pathology and the language used in the medical findings, which is German. The lexicon gives us indications about the usage limitations of an essentially English-oriented thesaurus like UMLS in our concrete setting. As a result of the phrasing module, the XML-encoded medical findings contain semantically relevant phrases, which can be linked to concepts of the knowledge base. This task will be realized by the annotation component.
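A much simplified version of the phrase identification step could look as follows. This is a sketch under stated assumptions, not the module under development: the lexicon entries and concept identifiers are made up, the tagger is omitted entirely, and phrases are found by a greedy longest-match lookup over lower-cased tokens.

```python
import re

# Hypothetical lexicon: lower-cased token sequences mapped to made-up concept ids.
LEXICON = {
    ("organisierenden", "pneumonie"): "CONCEPT-001",
    ("lungengewebe",): "CONCEPT-002",
    ("alveolarsepten",): "CONCEPT-003",
}
MAX_PHRASE_LEN = max(len(key) for key in LEXICON)


def tokenize(text):
    """Split text into word tokens (letters, German umlauts and hyphens)."""
    return re.findall(r"[A-Za-zÄÖÜäöüß-]+", text)


def find_phrases(text):
    """Greedy longest-match lookup of lexicon phrases in the token stream."""
    tokens = tokenize(text)
    lowered = [t.lower() for t in tokens]
    matches, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            key = tuple(lowered[i:i + n])
            if key in LEXICON:
                matches.append((" ".join(tokens[i:i + n]), LEXICON[key]))
                i += n
                break
        else:
            i += 1  # no phrase starts here, move on by one token
    return matches


text = "Stanzbiopsate aus der Lunge mit Zeichen der organisierenden Pneumonie."
print(find_phrases(text))  # [('organisierenden Pneumonie', 'CONCEPT-001')]
```

Matching on inflected surface forms, as done here, would miss other inflections of the same phrase. A German-aware tagger or lemmatizer in front of the lookup is one way such variation could be handled.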

Application Components

The Semantic Web for Pathology will assist the following application components:

– search component will be used primarily for diagnosis tasks. It will allow not only the basic retrieval of text/image information items, but also support differential diagnosis tasks. The semantic retrieval is oriented towards several typical categories of queries:

  • statistical queries, e.g. the probability/frequency of a particular carcinoma in a certain age group.

  • matching queries, e.g. comparison of cases with common characteristics, text and image information to similar cases.

  • image queries, e.g. cases containing images with certain content- or image-specific constraints.

  Besides, the retrieval should be adapted to the characteristics of the pathology domain and involve issues like the diagnosis path (Section 3.2).

– quality checking component will be used in quality assurance and management of diagnosis processes. Quality criteria, diagnosis standards and their verification are expressed by means of rules.

– statistical component will generate statistical material related to the relative frequency or demographic distribution of diseases and their complications.

– teaching component will generate teaching materials, using features of the previous components (statistical studies, reference cases).
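As an illustration of the statistical queries mentioned in the list above, the following sketch computes the relative frequency of a diagnosis per age group over a handful of annotated findings. The records, the diagnosis labels and the 20-year age buckets are hypothetical. A real query would run against the annotated knowledge base rather than an in-memory list.

```python
from collections import Counter

# Hypothetical annotated findings: (diagnosis concept, patient age).
FINDINGS = [
    ("carcinoma", 67), ("carcinoma", 72), ("pneumonia", 45),
    ("carcinoma", 48), ("pneumonia", 70), ("lymphoma", 66),
]


def age_group(age, width=20):
    """Label such as '60-79' for the age bucket containing `age`."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def relative_frequency(findings, diagnosis):
    """Share of findings with `diagnosis` within each age group."""
    totals, hits = Counter(), Counter()
    for concept, age in findings:
        group = age_group(age)
        totals[group] += 1
        if concept == diagnosis:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in sorted(totals)}


print(relative_frequency(FINDINGS, "carcinoma"))  # {'40-59': 0.5, '60-79': 0.5}
```

Because the diagnosis is matched as an annotated concept rather than as free text, findings that use a synonym or a narrower term could be counted as well once the ontology hierarchy is taken into account.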

4 Related Work

Medicine is one of the best examples of application domains where ontologies have already been deployed at large scale and have already demonstrated their utility. Most of these domain ontologies (UMLS included) were built to meet different design requirements than those of computer-supported applications, and of Semantic Web applications in particular. They are actually huge collections of medical terms, organized in hierarchies, and cannot be used directly in Semantic Web applications. This issue has been addressed in the GALEN project ([6]), where the authors developed a special description logic representation, tailored for the particularities of the (English) medical vocabulary. However, the usage of a proprietary representation makes the ontological knowledge difficult to extend by third parties or to exchange in a Semantic Web.

The usage of ontologies for building knowledge bases for medicine has already been the subject of several research projects ([2, 12, 7, 3, 8]). The most important representatives are the ONIONS ([7]) and MEDSYNDIKATE ([12]) projects. In ONIONS the authors aim to develop a generic framework for ontology merging and use UMLS as an example to apply their methodology. Therefore they need a detailed analysis of the ontological properties of UMLS, using a Loom formalization. MEDSYNDIKATE is also confronted with the ontological commitment beyond UMLS in order to use it in text processing algorithms for knowledge discovery. UMLS serves in this case as an annotation vocabulary for medical texts. Both projects offer valuable experiences and facts concerning UMLS and medical ontologies generally, but they do not use Semantic Web technologies to facilitate knowledge sharing and reuse, which is the crucial feature of ontologies. An interesting approach can also be found in [2], where the authors compare UMLS with other ontologies (e.g. WordNet ([11]), Gene Ontology) to establish its appropriateness as terminology for biomedical applications.

5 Conclusions and Future Work

In this paper we presented our work towards a Semantic Web-based retrieval system for pathology. The system is based on a comprehensive knowledge base, which formalizes pathology-relevant knowledge explicitly by integrating available medical ontologies like UMLS and rules describing diagnostic guidelines. It is intended to provide both retrieval and knowledge management functionalities. In order to achieve these goals we have so far designed the system architecture, adopted XML-based schemes for the uniform representation of medical findings and digital images and developed a methodology for the construction of the pathology knowledge base. Current work includes the specification and implementation of an algorithm for the OWL formalization of medical ontologies and their integration in the knowledge base.

Acknowledgement. The project “Semantic Web in the Pathology” is funded by the Deutsche Forschungsgemeinschaft, as a cooperation among the Charité Institute of Pathology, the Institute for Computer Science at the FU Berlin and the Department of Linguistics at the University of Potsdam, Germany.

References

1. T. Berners-Lee, J. Hendler, and O. Lassila. “The Semantic Web”. Scientific American, 284(5):34–43, May 2001.

2. A. Burgun and O. Bodenreider. Mapping the UMLS Semantic Network into General Ontologies. In Proc. of the AMIA Symposium, 2001.

3. G. Carenini and J. Moore. “Using the UMLS Semantic Network as a Basis for Constructing a Terminological Knowledge Base: A Preliminary Report”. In Proceedings of 17th Symposium on Computer Applications in Medical Care (SCAMC ’93), 1993.

4. The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genetics, 25:25–30, 2000.

5. F. Demichellis, V. Della Mea, S. Forti, P. Dalla Palma, and C. A. Beltrami. “Digital storage of glass slide for quality assurance in histopathology and cytopathology”. Telemed Telecare, 8(3):138–42, 2002.

6. Ontology GALEN. http://www.opengalen.org, 2001.

7. A. Gangemi, D. M. Pisanelli, and G. Steve. “An Overview of the ONIONS Project: Applying Ontologies to the Integration of Medical Terminologies”. Data Knowledge Engineering, 31(2):183–220, 1999.

8. H. Gu, Y. Perl, J. Geller, M. Halper, L. Liu, and J. Cimino. “Representing the UMLS as an OODB: Modeling issues and advantages”, 2000.

9. HL7 Standard. http://puck.informatik.med.uni-giessen.de/people/messaritakis/-hl7xml/hl7stand.htm, 2000.

10. The HL7/CDA Standard. http://www.hl7.org, 2000.

11. G. A. Miller. “WordNet: a lexical database for English”. Communications of the ACM, 38(11):39–41, 1995.

12. S. Schulz and U. Hahn. “Medical knowledge reengineering - converting major portions of the UMLS into a terminological knowledge base”. International Journal of Medical Informatics, 2001.

13. S. Schulz, M. Romacker, and U. Hahn. “Knowledge engineering the UMLS”. Stud Health Technol Inform, 77:701–5, 2000.

14. Patentanmeldung: Slide Scanner – Vorrichtung und Verfahren, 2002. Aktenzeichen 102317.6 des DPMA vom 5.8.2002.

15. J. Slodkowska, K. Kayser, and P. Hasleton. “Teleconsultation in the Chest Disorders”. Eur J Med Res, 7(Suppl I):80, 2002.

16. Unified Medical Language System. http://www.nlm.nih.gov/research/umls, 2002.

17. Patentanmeldung: Virtuelles Mikroskop – Vorrichtung und Verfahren, 2002. Aktenzeichen 10225174.6 des DPMA vom 31.05.2002.
