Package: polmineR 0.8.9

Andreas Blaette

polmineR: Verbs and Nouns for Corpus Analysis

Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.

Authors: Andreas Blaette [aut, cre], Christoph Leonhardt [ctb], Marius Bertram [ctb]

polmineR_0.8.9.tar.gz
polmineR_0.8.9.zip (r-4.5), polmineR_0.8.9.zip (r-4.4), polmineR_0.8.9.zip (r-4.3)
polmineR_0.8.9.tgz (r-4.4-any), polmineR_0.8.9.tgz (r-4.3-any)
polmineR_0.8.9.tar.gz (r-4.5-noble), polmineR_0.8.9.tar.gz (r-4.4-noble)
polmineR_0.8.9.tgz (r-4.4-emscripten), polmineR_0.8.9.tgz (r-4.3-emscripten)
polmineR.pdf | polmineR.html
polmineR/json (API)
NEWS

# Install 'polmineR' in R:
install.packages('polmineR', repos = c('https://polmine.r-universe.dev', 'https://cloud.r-project.org'))
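
Once the package is installed, the functionality named in the description (subcorpora, counts, co-occurrences, keyword-in-context) can be tried along the following lines. This is a minimal sketch, not the authors' reference workflow. It assumes the REUTERS and GERMAPARLMINI demo corpora that ship with polmineR, and the query term "oil" and the date "2009-11-10" are illustrative values chosen for this example.

```r
# Minimal sketch. Assumes polmineR is installed and that its bundled demo
# corpora (REUTERS, GERMAPARLMINI) are available; query and date are examples.
has_polmineR <- requireNamespace("polmineR", quietly = TRUE)

if (has_polmineR) {
  library(polmineR)
  use("polmineR")  # add the corpora shipped with the package to the session registry

  # Basic statistics: how often does a term occur in the corpus?
  oil_count <- count("REUTERS", query = "oil")

  # Keyword-in-context view, five tokens to the left and right of each match
  oil_kwic <- kwic("REUTERS", query = "oil", left = 5, right = 5)

  # Co-occurrence statistics for the same query
  oil_cooc <- cooccurrences("REUTERS", query = "oil")

  # Create a subcorpus from a structural attribute and report its token count
  day <- subset(corpus("GERMAPARLMINI"), date == "2009-11-10")
  print(size(day))
}
```

The guard around `requireNamespace()` only keeps the snippet from failing where the package is missing; in an interactive session the calls inside the block can be run directly.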

Bug tracker: https://github.com/polmine/polminer/issues

121 exports 48 stars 3.28 score 46 dependencies 307 scripts 535 downloads

Last updated 11 months ago from: 842d4a6854. Checks: OK: 7. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Sep 07 2024
R-4.5-win | OK | Sep 07 2024
R-4.5-linux | OK | Sep 07 2024
R-4.4-win | OK | Sep 07 2024
R-4.4-mac | OK | Sep 07 2024
R-4.3-win | OK | Sep 07 2024
R-4.3-mac | OK | Sep 07 2024

Exports: %>%, aggregate, annotations, annotations<-, as_igraph, as.bundle, as.corpusEnc, as.cqp, as.data.frame, as.data.table, as.DocumentTermMatrix, as.list, as.markdown, as.matrix, as.nativeEnc, as.partition_bundle, as.partitionBundle, as.phrases, as.regions, as.simple_triplet_matrix, as.sparseMatrix, as.speeches, as.TermDocumentMatrix, as.utf8, as.VCorpus, barplot, blapply, browse, capitalize, check_cqp_query, chisquare, colnames, concatenate_phrases, context, cooccurrences, Cooccurrences, corpus, count, cp, cpos, data_dir, decode, dispersion, dotplot, edit, encoding, encoding<-, enrich, features, flatten, format, get_corpus, get_info, get_template, get_token_stream, get_type, getEncoding, getTerms, getTokenStream, head, highlight, hist, hits, href, html, is_nested, is.cqp, is.partition, knit_print, kwic, ll, mail, merge, name, name<-, ncol, ngrams, noise, nrow, ocpu_exec, p_attributes, partition, partition_bundle, partitionBundle, pAttributes, pmi, polmineR, punctuation, ranges, read, regions, registry, registry_get_encoding, registry_get_home, registry_get_id, registry_get_info, registry_get_name, registry_get_p_attributes, registry_get_properties, registry_get_s_attributes, registry_move, registry_reset, restore, s_attributes, sample, sAttributes, show, show_info, size, sort, split, subset, summary, tail, terms, tooltips, tree_structure, trim, use, view, weigh

Dependencies: base64enc, BH, bslib, cachem, cli, crosstalk, data.table, digest, DT, evaluate, fastmap, fontawesome, fs, glue, highr, htmltools, htmlwidgets, httpuv, jquerylib, jsonlite, knitr, later, lattice, lazyeval, lifecycle, magrittr, Matrix, memoise, mime, NLP, pbapply, promises, R6, rappdirs, Rcpp, RcppCWB, rlang, rmarkdown, sass, slam, stringi, tinytex, tm, xfun, xml2, yaml

Encodings

Rendered from encodings.Rmd using knitr::rmarkdown on Sep 07 2024.

Last update: 2021-03-17
Started: 2021-03-17

Introducing the 'polmineR'-package

Rendered from vignette.Rmd using knitr::rmarkdown on Sep 07 2024.

Last update: 2022-05-08
Started: 2017-04-19

OpenCPU

Rendered from OpenCPU.Rmd using knitr::rmarkdown on Sep 07 2024.

Last update: 2021-02-08
Started: 2019-11-14

Readme and manuals

Help Manual

Help page | Topics
polmineR-package | polmineR-package polmineR
Annotation functionality | annotations annotations,kwic-method annotations,textstat-method annotations<- annotations<-,kwic,list-method annotations<-,textstat,list-method edit,textstat-method
Get markdown-formatted full text of a partition. | as.markdown as.markdown,partition-method as.markdown,plpr_partition-method as.markdown,plpr_subcorpus-method as.markdown,subcorpus-method
Type conversion - get sparseMatrix. | as.sparseMatrix as.sparseMatrix,bundle-method as.sparseMatrix,DocumentTermMatrix-method as.sparseMatrix,simple_triplet_matrix-method as.sparseMatrix,TermDocumentMatrix-method
Split corpus or partition into speeches. | as.speeches as.speeches,character-method as.speeches,corpus-method as.speeches,partition-method as.speeches,subcorpus-method
Generate TermDocumentMatrix / DocumentTermMatrix. | as.DocumentTermMatrix as.DocumentTermMatrix,bundle-method as.DocumentTermMatrix,character-method as.DocumentTermMatrix,context-method as.DocumentTermMatrix,corpus-method as.DocumentTermMatrix,partition_bundle-method as.DocumentTermMatrix,subcorpus_bundle-method as.TermDocumentMatrix as.TermDocumentMatrix,bundle-method as.TermDocumentMatrix,character-method as.TermDocumentMatrix,context-method as.TermDocumentMatrix,partition_bundle-method as.TermDocumentMatrix,subcorpus_bundle-method
Get VCorpus. | as as.VCorpus as.VCorpus,partition_bundle-method
apply a function over a list or bundle | blapply blapply,bundle-method blapply,list-method blapply,vector-method
Bundle Class | $,bundle-method $<-,bundle-method +,bundle,bundle-method +,bundle,textstat-method as.bundle,list-method as.bundle,textstat-method as.data.table.bundle as.list,bundle-method as.list.bundle as.matrix,bundle-method bundle bundle-class get_corpus,bundle-method length,bundle-method name<-,bundle-method names,bundle-method names<-,bundle,vector-method sample,bundle-method subset,bundle-method unique,bundle-method [,bundle,ANY,ANY,ANY-method [[,bundle-method [[<-,bundle-method
Capitalize character vector. | capitalize
Perform chisquare-test. | chisquare chisquare,context-method chisquare,cooccurrences-method chisquare,features-method
Analyze context of a node word. | as.matrix,context_bundle-method context context,character-method context,cooccurrences-method context,corpus-method context,matrix-method context,partition-method context,partition_bundle-method context,slice-method context,subcorpus-method
S4 context_bundle class | context_bundle-class show,context_bundle-method summary,context_bundle-method [,context_bundle,ANY,ANY,ANY-method [,context_bundle-method [[,context_bundle-method
Context class. | as.DataTables,context-method as.regions,context-method context-class count,context-method enrich,context-method head,context-method length,context-method p_attributes,context-method sample,context-method show,context-method summary,context-method trim,context-method [,context,ANY,ANY,ANY-method [,context-method [[,context-method
Get cooccurrence statistics. | cooccurrences cooccurrences,character-method cooccurrences,context-method cooccurrences,Cooccurrences-method cooccurrences,corpus-method cooccurrences,partition-method cooccurrences,partition_bundle-method cooccurrences,remote_corpus-method cooccurrences,remote_subcorpus-method cooccurrences,slice-method cooccurrences,subcorpus-method
Cooccurrences class. | as.data.frame,cooccurrences_bundle-method cooccurrences-class cooccurrences_bundle cooccurrences_bundle-class cooccurrences_reshaped-class format,cooccurrences-method show,cooccurrences-method view,cooccurrences-method view,cooccurrences_reshaped-method
Cooccurrences class for corpus/partition. | as.simple_triplet_matrix,Cooccurrences-method as.sparseMatrix,Cooccurrences-method as_igraph as_igraph,Cooccurrences-method Cooccurrences-class decode,Cooccurrences-method enrich,Cooccurrences-method kwic,Cooccurrences-method subset,Cooccurrences-method
Get all cooccurrences in corpus/partition. | Cooccurrences Cooccurrences,character-method Cooccurrences,corpus-method Cooccurrences,partition-method Cooccurrences,slice-method Cooccurrences,subcorpus-method
Corpus class initialization | corpus corpus,character-method corpus,missing-method corpus-class get_corpus remote_corpus remote_corpus-class zoom
Corpus class methods | $,corpus-method corpus-methods get_corpus,corpus-method get_info,corpus-method name,corpus-method show,corpus-method show_info,corpus-method
Get counts. | count count,character-method count,corpus-method count,partition-method count,partition_bundle-method count,remote_corpus-method count,remote_subcorpus-method count,subcorpus-method count,subcorpus_bundle-method count,vector-method count-method
Count class. | count-class count_bundle-class count_class hist,count-method length,count-method summary,count-method
Get corpus positions for a query or queries. | cpos cpos,character-method cpos,corpus-method cpos,hits-method cpos,matrix-method cpos,NULL-method cpos,partition-method cpos,slice-method cpos,subcorpus-method
Tools for CQP queries. | as.cqp check_cqp_query cqp is.cqp
Decode corpus or subcorpus. | decode decode,character-method decode,corpus-method decode,data.table-method decode,integer-method decode,partition-method decode,slice-method decode,subcorpus-method decode-method
Dispersion of a query or multiple queries. | dispersion dispersion,character-method dispersion,corpus-method dispersion,hits-method dispersion,partition-method dispersion,remote_corpus-method dispersion,remote_subcorpus-method dispersion,slice-method dispersion,subcorpus-method
dotplot | dotplot dotplot,features-method dotplot,features_ngrams-method dotplot,partition-method dotplot,textstat-method
Get and set encoding. | encoding encoding,bundle-method encoding,call-method encoding,character-method encoding,corpus-method encoding,missing-method encoding,quosure-method encoding,subcorpus-method encoding,textstat-method encoding<- encoding<-,call-method encoding<-,quosure-method
Conversion between corpus and native encoding. | as.corpusEnc as.nativeEnc as.utf8 encodings
Enrich an object. | enrich enrich-method
Get features by comparison. | features features,Cooccurrences-method features,count-method features,count_bundle-method features,ngrams-method features,partition-method features,partition_bundle-method
Feature selection by comparison. | features-class features_bundle-class features_cooccurrences-class features_ngrams-class format,features-method kwic_bundle-class show,features-method summary,features-method summary,features_bundle-method view,features-method
Get template for formatting full text output. | get_template get_template,character-method get_template,corpus-method get_template,subcorpus-method
Get Token Stream. | get_token_stream get_token_stream,character-method get_token_stream,corpus-method get_token_stream,matrix-method get_token_stream,numeric-method get_token_stream,partition-method get_token_stream,partition_bundle-method get_token_stream,regions-method get_token_stream,slice-method get_token_stream,subcorpus-method
Get corpus/partition type. | get_type get_type,character-method get_type,corpus-method get_type,partition_bundle-method get_type,subcorpus-method get_type,subcorpus_bundle-method
Highlight tokens in text output. | highlight highlight,character-method highlight,html-method highlight,kwic-method highlight-method
Get hits for query | hits hits,character-method hits,context-method hits,corpus-method hits,partition-method hits,partition_bundle-method hits,remote_corpus-method hits,remote_subcorpus-method hits,subcorpus-method
S4 class to represent hits for queries. | hits-class hits_class sample,hits-method
Add hypertext reference to html document. | href href-function
Generate html from object. | html html,character-method html,kwic-method html,partition-method html,partition_bundle-method html,remote_subcorpus-method html,subcorpus-method show,html-method
Check whether s-attributes of corpus are nested | is_nested
Perform keyword-in-context (KWIC) analysis. | kwic kwic,character-method kwic,context-method kwic,corpus-method kwic,partition-method kwic,partition_bundle-method kwic,remote_corpus-method kwic,remote_partition-method kwic,remote_subcorpus-method kwic,slice-method kwic,subcorpus-method kwic,subcorpus_bundle-method
S4 kwic class | as.character,kwic-method as.data.frame,kwic-method as.DocumentTermMatrix,kwic-method as.TermDocumentMatrix,kwic-method count,kwic-method enrich,kwic-method format,kwic-method get_corpus,kwic-method knit_print,kwic-method kwic-class length,kwic-method merge,kwic_bundle-method sample,kwic-method show,kwic-method subset,kwic-method view,kwic-method [,kwic,ANY,ANY,ANY-method [,kwic-method
Compute Log-likelihood Statistics. | ll ll,context-method ll,Cooccurrences-method ll,cooccurrences-method ll,features-method
calculate means | means means,DocumentTermMatrix-method
Get N-Grams | ngrams ngrams,character-method ngrams,corpus-method ngrams,data.table-method ngrams,list-method ngrams,partition-method ngrams,partition_bundle-method ngrams,subcorpus-method
Ngrams class. | ngrams-class ngrams_class
detect noise | noise noise,character-method noise,DocumentTermMatrix-method noise,TermDocumentMatrix-method noise,textstat-method
Execute code on OpenCPU server | ocpu_exec opencpu
Get p-attributes. | p_attributes p_attributes,character-method p_attributes,corpus-method p_attributes,partition_bundle-method p_attributes,remote_corpus-method p_attributes,remote_partition-method p_attributes,slice-method
Initialize a partition. | partition partition,character-method partition,context-method partition,corpus-method partition,environment-method partition,partition-method partition,remote_corpus-method partition,remote_partition-method
Generate bundle of partitions. | partition_bundle partition_bundle,character-method partition_bundle,context-method partition_bundle,corpus-method partition_bundle,partition-method partition_bundle,partition_bundle-method
Bundle of partitions (partition_bundle class). | +,partition_bundle,ANY-method +,partition_bundle,partition-method +,partition_bundle,partition_bundle-method +,partition_bundle-method as.matrix,partition_bundle-method as.partition_bundle,list-method barplot,partition_bundle-method enrich,partition_bundle-method enrich,subcorpus_bundle-method flatten merge,partition_bundle-method names,partition_bundle-method partition_bundle,environment-method partition_bundle-class show,partition_bundle-method summary,partition_bundle-method [,partition_bundle,ANY,ANY,ANY-method [,partition_bundle-method [[,partition_bundle-method
Partition class and methods. | as.partition_bundle as.partition_bundle,partition-method as.regions,partition-method enrich,partition-method export export,partition-method is.partition partition-class partition_class plpr_partition-class press_partition-class p_attributes,partition-method p_attributes,subcorpus-method remote_partition-class show,partition-method split split,partition-method [,partition,ANY,ANY,ANY-method [,partition-method
Decode as String. | partition_to_string
Manage and use phrases | as.character,phrases-method as.phrases as.phrases,matrix-method as.phrases,ngrams-method concatenate_phrases phrases phrases-class
Calculate Pointwise Mutual Information (PMI). | pmi pmi,context-method pmi,Cooccurrences-method pmi,ngrams-method
Defunct functionality | browse mail polmineR-defunct
Generic methods defined in the polmineR package | get_info polmineR-generics show_info
Get ranges for query. | ranges ranges,character-method ranges,corpus-method ranges,partition-method ranges,subcorpus-method
Ranges of query matches. | as.data.table.ranges ranges-class
Display full text. | read read,data.table-method read,hits-method read,kwic-method read,partition-method read,partition_bundle-method read,regions-method read,subcorpus-method
Regions of a CWB corpus. | as.data.table.regions as.regions regions regions,corpus-method regions,subcorpus-method regions-class
Get registry and data directories. | data_dir registry registry_move
Renamed Functions | as.partitionBundle corpus,bundle-method corpus,kwic-method corpus,textstat-method getEncoding getTerms getTokenStream partitionBundle pAttributes renamed sAttributes
Get s-attributes. | s_attributes s_attributes,call-method s_attributes,character-method s_attributes,context-method s_attributes,corpus-method s_attributes,data.table-method s_attributes,name-method s_attributes,partition-method s_attributes,partition_bundle-method s_attributes,quosure-method s_attributes,remote_corpus-method s_attributes,remote_partition-method s_attributes,slice-method s_attributes,subcorpus-method
Get Number of Tokens. | size size,character-method size,corpus-method size,DocumentTermMatrix-method size,features-method size,partition-method size,partition_bundle-method size,remote_corpus-method size,remote_partition-method size,slice-method size,TermDocumentMatrix-method
Virtual class slice. | aggregate,slice-method slice slice-class
The S4 subcorpus class. | get_corpus,subcorpus-method name<-,subcorpus-method plpr_subcorpus-class press_subcorpus-class remote_subcorpus-class size,subcorpus-method subcorpus subcorpus-class summary,subcorpus-method
Bundled subcorpora | merge,subcorpus-method merge,subcorpus_bundle-method show,subcorpus_bundle-method split,corpus-method split,subcorpus-method split,subcorpus_bundle-method subcorpus_bundle-class
Subsetting corpora and subcorpora | subset subset,character-method subset,corpus-method subset,remote_corpus-method subset,subcorpus-method subset,subcorpus_bundle-method subset-method
Perform t-test. | t_test t_test,context-method
Get terms in 'partition' or 'corpus'. | terms terms,character-method terms,corpus-method terms,partition-method terms,slice-method terms,subcorpus-method
S4 textstat superclass. | +,textstat,textstat-method as.bundle as.data.frame,textstat-method as.data.table.textstat as.DataTables,textstat-method colnames,textstat-method cp dim,textstat-method format,textstat-method get_corpus,textstat-method head,textstat-method knit_print,textstat-method name name,character-method name,textstat-method name<- name<-,textstat-method names,textstat-method ncol,textstat-method nrow,textstat-method p_attributes,textstat-method restore round,textstat-method rownames,textstat-method show,textstat-method sort,textstat-method subset,textstat-method tail,textstat-method textstat-class view,textstat-method [,textstat,ANY,ANY,ANY-method [[,textstat-method
Add tooltips to text output. | tooltips tooltips,character-method tooltips,html-method tooltips,kwic-method tooltips-method
Show the structure of s-attributes | tree_structure tree_structure,corpus-method tree_structure,subcorpus-method tree_structure,xml_document-method tree_structure,xml_node-method
Trim an object. | punctuation trim trim,DocumentTermMatrix-method trim,TermDocumentMatrix-method trim-method
Add corpora in R data packages to session registry. | use
Inspect object using View(). | view
Apply Weight to Matrix | weigh weigh,count-method weigh,count_bundle-method weigh,DocumentTermMatrix-method weigh,TermDocumentMatrix-method