Package: polmineR 0.8.9
polmineR: Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Authors:
polmineR_0.8.9.tar.gz
polmineR_0.8.9.zip (r-4.5), polmineR_0.8.9.zip (r-4.4), polmineR_0.8.9.zip (r-4.3)
polmineR_0.8.9.tgz (r-4.4-any), polmineR_0.8.9.tgz (r-4.3-any)
polmineR_0.8.9.tar.gz (r-4.5-noble), polmineR_0.8.9.tar.gz (r-4.4-noble)
polmineR_0.8.9.tgz (r-4.4-emscripten), polmineR_0.8.9.tgz (r-4.3-emscripten)
polmineR.pdf | polmineR.html
polmineR/json (API)
NEWS
# Install 'polmineR' in R:
install.packages('polmineR', repos = c('https://polmine.r-universe.dev', 'https://cloud.r-project.org'))
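Once the package is installed, the operations named in the description (counting, keyword-in-context, subcorpora, co-occurrences) can be sketched roughly as follows. This is a minimal illustration, not the package's canonical workflow. It assumes that the sample corpus `GERMAPARLMINI` is shipped with polmineR, that it has an s-attribute `date`, and that the query term `Integration` occurs in it; the variable names (`ran`, `hits`, `lines`, `sub`, `coocs`) are chosen for illustration only.

```r
# Minimal sketch; assumes polmineR is installed and ships the sample
# corpus GERMAPARLMINI with an s-attribute "date" (both assumptions).
ran <- FALSE
if (requireNamespace("polmineR", quietly = TRUE)) {
  library(polmineR)

  # Add corpora included in the package to the session registry.
  use("polmineR")

  # Count occurrences of a token in the corpus.
  hits <- count("GERMAPARLMINI", query = "Integration")

  # Keyword-in-context display for the same query.
  lines <- kwic("GERMAPARLMINI", query = "Integration")

  # Build a subcorpus (partition) from the first available value of "date".
  dates <- s_attributes("GERMAPARLMINI", "date")
  sub <- partition("GERMAPARLMINI", date = dates[1])

  # Number of tokens in the partition.
  print(size(sub))

  # Co-occurrence statistics for the query term.
  coocs <- cooccurrences("GERMAPARLMINI", query = "Integration")

  ran <- TRUE
}
```

The `requireNamespace()` guard only keeps the snippet from failing where the package is absent; in an interactive session it can be dropped. Calling `corpus()` without arguments after `use()` lists the corpora that are actually available.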
Bug tracker: https://github.com/polmine/polminer/issues
Last updated 1 year ago from: 842d4a6854. Checks: OK: 7. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Nov 06 2024 |
R-4.5-win | OK | Nov 06 2024 |
R-4.5-linux | OK | Nov 06 2024 |
R-4.4-win | OK | Nov 06 2024 |
R-4.4-mac | OK | Nov 06 2024 |
R-4.3-win | OK | Nov 06 2024 |
R-4.3-mac | OK | Nov 06 2024 |
Exports: %>%, aggregate, annotations, annotations<-, as_igraph, as.bundle, as.corpusEnc, as.cqp, as.data.frame, as.data.table, as.DocumentTermMatrix, as.list, as.markdown, as.matrix, as.nativeEnc, as.partition_bundle, as.partitionBundle, as.phrases, as.regions, as.simple_triplet_matrix, as.sparseMatrix, as.speeches, as.TermDocumentMatrix, as.utf8, as.VCorpus, barplot, blapply, browse, capitalize, check_cqp_query, chisquare, colnames, concatenate_phrases, context, cooccurrences, Cooccurrences, corpus, count, cp, cpos, data_dir, decode, dispersion, dotplot, edit, encoding, encoding<-, enrich, features, flatten, format, get_corpus, get_info, get_template, get_token_stream, get_type, getEncoding, getTerms, getTokenStream, head, highlight, hist, hits, href, html, is_nested, is.cqp, is.partition, knit_print, kwic, ll, mail, merge, name, name<-, ncol, ngrams, noise, nrow, ocpu_exec, p_attributes, partition, partition_bundle, partitionBundle, pAttributes, pmi, polmineR, punctuation, ranges, read, regions, registry, registry_get_encoding, registry_get_home, registry_get_id, registry_get_info, registry_get_name, registry_get_p_attributes, registry_get_properties, registry_get_s_attributes, registry_move, registry_reset, restore, s_attributes, sample, sAttributes, show, show_info, size, sort, split, subset, summary, tail, terms, tooltips, tree_structure, trim, use, view, weigh
Dependencies: base64enc, BH, bslib, cachem, cli, crosstalk, data.table, digest, DT, evaluate, fastmap, fontawesome, fs, glue, highr, htmltools, htmlwidgets, httpuv, jquerylib, jsonlite, knitr, later, lattice, lazyeval, lifecycle, magrittr, Matrix, memoise, mime, NLP, pbapply, promises, R6, rappdirs, Rcpp, RcppCWB, rlang, rmarkdown, sass, slam, stringi, tinytex, tm, xfun, xml2, yaml
Encodings
Rendered from encodings.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-03-17
Started: 2021-03-17
Introducing the 'polmineR'-package
Rendered from vignette.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2022-05-08
Started: 2017-04-19
OpenCPU
Rendered from OpenCPU.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-02-08
Started: 2019-11-14
Readme and manuals
Help Manual
Help page | Topics |
---|---|
polmineR-package | polmineR-package polmineR |
Annotation functionality | annotations annotations,kwic-method annotations,textstat-method annotations<- annotations<-,kwic,list-method annotations<-,textstat,list-method edit,textstat-method |
Get markdown-formatted full text of a partition. | as.markdown as.markdown,partition-method as.markdown,plpr_partition-method as.markdown,plpr_subcorpus-method as.markdown,subcorpus-method |
Type conversion - get sparseMatrix. | as.sparseMatrix as.sparseMatrix,bundle-method as.sparseMatrix,DocumentTermMatrix-method as.sparseMatrix,simple_triplet_matrix-method as.sparseMatrix,TermDocumentMatrix-method |
Split corpus or partition into speeches. | as.speeches as.speeches,character-method as.speeches,corpus-method as.speeches,partition-method as.speeches,subcorpus-method |
Generate TermDocumentMatrix / DocumentTermMatrix. | as.DocumentTermMatrix as.DocumentTermMatrix,bundle-method as.DocumentTermMatrix,character-method as.DocumentTermMatrix,context-method as.DocumentTermMatrix,corpus-method as.DocumentTermMatrix,partition_bundle-method as.DocumentTermMatrix,subcorpus_bundle-method as.TermDocumentMatrix as.TermDocumentMatrix,bundle-method as.TermDocumentMatrix,character-method as.TermDocumentMatrix,context-method as.TermDocumentMatrix,partition_bundle-method as.TermDocumentMatrix,subcorpus_bundle-method |
Get VCorpus. | as as.VCorpus as.VCorpus,partition_bundle-method |
apply a function over a list or bundle | blapply blapply,bundle-method blapply,list-method blapply,vector-method |
Bundle Class | $,bundle-method $<-,bundle-method +,bundle,bundle-method +,bundle,textstat-method as.bundle,list-method as.bundle,textstat-method as.data.table.bundle as.list,bundle-method as.list.bundle as.matrix,bundle-method bundle bundle-class get_corpus,bundle-method length,bundle-method name<-,bundle-method names,bundle-method names<-,bundle,vector-method sample,bundle-method subset,bundle-method unique,bundle-method [,bundle,ANY,ANY,ANY-method [[,bundle-method [[<-,bundle-method |
Capitalize character vector. | capitalize |
Perform chisquare-test. | chisquare chisquare,context-method chisquare,cooccurrences-method chisquare,features-method |
Analyze context of a node word. | as.matrix,context_bundle-method context context,character-method context,cooccurrences-method context,corpus-method context,matrix-method context,partition-method context,partition_bundle-method context,slice-method context,subcorpus-method |
S4 context_bundle class | context_bundle-class show,context_bundle-method summary,context_bundle-method [,context_bundle,ANY,ANY,ANY-method [,context_bundle-method [[,context_bundle-method |
Context class. | as.DataTables,context-method as.regions,context-method context-class count,context-method enrich,context-method head,context-method length,context-method p_attributes,context-method sample,context-method show,context-method summary,context-method trim,context-method [,context,ANY,ANY,ANY-method [,context-method [[,context-method |
Get cooccurrence statistics. | cooccurrences cooccurrences,character-method cooccurrences,context-method cooccurrences,Cooccurrences-method cooccurrences,corpus-method cooccurrences,partition-method cooccurrences,partition_bundle-method cooccurrences,remote_corpus-method cooccurrences,remote_subcorpus-method cooccurrences,slice-method cooccurrences,subcorpus-method |
Cooccurrences class. | as.data.frame,cooccurrences_bundle-method cooccurrences-class cooccurrences_bundle cooccurrences_bundle-class cooccurrences_reshaped-class format,cooccurrences-method show,cooccurrences-method view,cooccurrences-method view,cooccurrences_reshaped-method |
Cooccurrences class for corpus/partition. | as.simple_triplet_matrix,Cooccurrences-method as.sparseMatrix,Cooccurrences-method as_igraph as_igraph,Cooccurrences-method Cooccurrences-class decode,Cooccurrences-method enrich,Cooccurrences-method kwic,Cooccurrences-method subset,Cooccurrences-method |
Get all cooccurrences in corpus/partition. | Cooccurrences Cooccurrences,character-method Cooccurrences,corpus-method Cooccurrences,partition-method Cooccurrences,slice-method Cooccurrences,subcorpus-method |
Corpus class initialization | corpus corpus,character-method corpus,missing-method corpus-class get_corpus remote_corpus remote_corpus-class zoom |
Corpus class methods | $,corpus-method corpus-methods get_corpus,corpus-method get_info,corpus-method name,corpus-method show,corpus-method show_info,corpus-method |
Get counts. | count count,character-method count,corpus-method count,partition-method count,partition_bundle-method count,remote_corpus-method count,remote_subcorpus-method count,subcorpus-method count,subcorpus_bundle-method count,vector-method count-method |
Count class. | count-class count_bundle-class count_class hist,count-method length,count-method summary,count-method |
Get corpus positions for a query or queries. | cpos cpos,character-method cpos,corpus-method cpos,hits-method cpos,matrix-method cpos,NULL-method cpos,partition-method cpos,slice-method cpos,subcorpus-method |
Tools for CQP queries. | as.cqp check_cqp_query cqp is.cqp |
Decode corpus or subcorpus. | decode decode,character-method decode,corpus-method decode,data.table-method decode,integer-method decode,partition-method decode,slice-method decode,subcorpus-method decode-method |
Dispersion of a query or multiple queries. | dispersion dispersion,character-method dispersion,corpus-method dispersion,hits-method dispersion,partition-method dispersion,remote_corpus-method dispersion,remote_subcorpus-method dispersion,slice-method dispersion,subcorpus-method |
dotplot | dotplot dotplot,features-method dotplot,features_ngrams-method dotplot,partition-method dotplot,textstat-method |
Get and set encoding. | encoding encoding,bundle-method encoding,call-method encoding,character-method encoding,corpus-method encoding,missing-method encoding,quosure-method encoding,subcorpus-method encoding,textstat-method encoding<- encoding<-,call-method encoding<-,quosure-method |
Conversion between corpus and native encoding. | as.corpusEnc as.nativeEnc as.utf8 encodings |
Enrich an object. | enrich enrich-method |
Get features by comparison. | features features,Cooccurrences-method features,count-method features,count_bundle-method features,ngrams-method features,partition-method features,partition_bundle-method |
Feature selection by comparison. | features-class features_bundle-class features_cooccurrences-class features_ngrams-class format,features-method kwic_bundle-class show,features-method summary,features-method summary,features_bundle-method view,features-method |
Get template for formatting full text output. | get_template get_template,character-method get_template,corpus-method get_template,subcorpus-method |
Get Token Stream. | get_token_stream get_token_stream,character-method get_token_stream,corpus-method get_token_stream,matrix-method get_token_stream,numeric-method get_token_stream,partition-method get_token_stream,partition_bundle-method get_token_stream,regions-method get_token_stream,slice-method get_token_stream,subcorpus-method |
Get corpus/partition type. | get_type get_type,character-method get_type,corpus-method get_type,partition_bundle-method get_type,subcorpus-method get_type,subcorpus_bundle-method |
Highlight tokens in text output. | highlight highlight,character-method highlight,html-method highlight,kwic-method highlight-method |
Get hits for query | hits hits,character-method hits,context-method hits,corpus-method hits,partition-method hits,partition_bundle-method hits,remote_corpus-method hits,remote_subcorpus-method hits,subcorpus-method |
S4 class to represent hits for queries. | hits-class hits_class sample,hits-method |
Add hypertext reference to html document. | href href-function |
Generate html from object. | html html,character-method html,kwic-method html,partition-method html,partition_bundle-method html,remote_subcorpus-method html,subcorpus-method show,html-method |
Check whether s-attributes of corpus are nested | is_nested |
Perform keyword-in-context (KWIC) analysis. | kwic kwic,character-method kwic,context-method kwic,corpus-method kwic,partition-method kwic,partition_bundle-method kwic,remote_corpus-method kwic,remote_partition-method kwic,remote_subcorpus-method kwic,slice-method kwic,subcorpus-method kwic,subcorpus_bundle-method |
S4 kwic class | as.character,kwic-method as.data.frame,kwic-method as.DocumentTermMatrix,kwic-method as.TermDocumentMatrix,kwic-method count,kwic-method enrich,kwic-method format,kwic-method get_corpus,kwic-method knit_print,kwic-method kwic-class length,kwic-method merge,kwic_bundle-method sample,kwic-method show,kwic-method subset,kwic-method view,kwic-method [,kwic,ANY,ANY,ANY-method [,kwic-method |
Compute Log-likelihood Statistics. | ll ll,context-method ll,Cooccurrences-method ll,cooccurrences-method ll,features-method |
calculate means | means means,DocumentTermMatrix-method |
Get N-Grams | ngrams ngrams,character-method ngrams,corpus-method ngrams,data.table-method ngrams,list-method ngrams,partition-method ngrams,partition_bundle-method ngrams,subcorpus-method |
Ngrams class. | ngrams-class ngrams_class |
detect noise | noise noise,character-method noise,DocumentTermMatrix-method noise,TermDocumentMatrix-method noise,textstat-method |
Execute code on OpenCPU server | ocpu_exec opencpu |
Get p-attributes. | p_attributes p_attributes,character-method p_attributes,corpus-method p_attributes,partition_bundle-method p_attributes,remote_corpus-method p_attributes,remote_partition-method p_attributes,slice-method |
Initialize a partition. | partition partition,character-method partition,context-method partition,corpus-method partition,environment-method partition,partition-method partition,remote_corpus-method partition,remote_partition-method |
Generate bundle of partitions. | partition_bundle partition_bundle,character-method partition_bundle,context-method partition_bundle,corpus-method partition_bundle,partition-method partition_bundle,partition_bundle-method |
Bundle of partitions (partition_bundle class). | +,partition_bundle,ANY-method +,partition_bundle,partition-method +,partition_bundle,partition_bundle-method +,partition_bundle-method as.matrix,partition_bundle-method as.partition_bundle,list-method barplot,partition_bundle-method enrich,partition_bundle-method enrich,subcorpus_bundle-method flatten merge,partition_bundle-method names,partition_bundle-method partition_bundle,environment-method partition_bundle-class show,partition_bundle-method summary,partition_bundle-method [,partition_bundle,ANY,ANY,ANY-method [,partition_bundle-method [[,partition_bundle-method |
Partition class and methods. | as.partition_bundle as.partition_bundle,partition-method as.regions,partition-method enrich,partition-method export export,partition-method is.partition partition-class partition_class plpr_partition-class press_partition-class p_attributes,partition-method p_attributes,subcorpus-method remote_partition-class show,partition-method split split,partition-method [,partition,ANY,ANY,ANY-method [,partition-method |
Decode as String. | partition_to_string |
Manage and use phrases | as.character,phrases-method as.phrases as.phrases,matrix-method as.phrases,ngrams-method concatenate_phrases phrases phrases-class |
Calculate Pointwise Mutual Information (PMI). | pmi pmi,context-method pmi,Cooccurrences-method pmi,ngrams-method |
Defunct functionality | browse mail polmineR-defunct |
Generic methods defined in the polmineR package | get_info polmineR-generics show_info |
Get ranges for query. | ranges ranges,character-method ranges,corpus-method ranges,partition-method ranges,subcorpus-method |
Ranges of query matches. | as.data.table.ranges ranges-class |
Display full text. | read read,data.table-method read,hits-method read,kwic-method read,partition-method read,partition_bundle-method read,regions-method read,subcorpus-method |
Regions of a CWB corpus. | as.data.table.regions as.regions regions regions,corpus-method regions,subcorpus-method regions-class |
Get registry and data directories. | data_dir registry registry_move |
Renamed Functions | as.partitionBundle corpus,bundle-method corpus,kwic-method corpus,textstat-method getEncoding getTerms getTokenStream partitionBundle pAttributes renamed sAttributes |
Get s-attributes. | s_attributes s_attributes,call-method s_attributes,character-method s_attributes,context-method s_attributes,corpus-method s_attributes,data.table-method s_attributes,name-method s_attributes,partition-method s_attributes,partition_bundle-method s_attributes,quosure-method s_attributes,remote_corpus-method s_attributes,remote_partition-method s_attributes,slice-method s_attributes,subcorpus-method |
Get Number of Tokens. | size size,character-method size,corpus-method size,DocumentTermMatrix-method size,features-method size,partition-method size,partition_bundle-method size,remote_corpus-method size,remote_partition-method size,slice-method size,TermDocumentMatrix-method |
Virtual class slice. | aggregate,slice-method slice slice-class |
The S4 subcorpus class. | get_corpus,subcorpus-method name<-,subcorpus-method plpr_subcorpus-class press_subcorpus-class remote_subcorpus-class size,subcorpus-method subcorpus subcorpus-class summary,subcorpus-method |
Bundled subcorpora | merge,subcorpus-method merge,subcorpus_bundle-method show,subcorpus_bundle-method split,corpus-method split,subcorpus-method split,subcorpus_bundle-method subcorpus_bundle-class |
Subsetting corpora and subcorpora | subset subset,character-method subset,corpus-method subset,remote_corpus-method subset,subcorpus-method subset,subcorpus_bundle-method subset-method |
Perform t-test. | t_test t_test,context-method |
Get terms in 'partition' or 'corpus'. | terms terms,character-method terms,corpus-method terms,partition-method terms,slice-method terms,subcorpus-method |
S4 textstat superclass. | +,textstat,textstat-method as.bundle as.data.frame,textstat-method as.data.table.textstat as.DataTables,textstat-method colnames,textstat-method cp dim,textstat-method format,textstat-method get_corpus,textstat-method head,textstat-method knit_print,textstat-method name name,character-method name,textstat-method name<- name<-,textstat-method names,textstat-method ncol,textstat-method nrow,textstat-method p_attributes,textstat-method restore round,textstat-method rownames,textstat-method show,textstat-method sort,textstat-method subset,textstat-method tail,textstat-method textstat-class view,textstat-method [,textstat,ANY,ANY,ANY-method [[,textstat-method |
Add tooltips to text output. | tooltips tooltips,character-method tooltips,html-method tooltips,kwic-method tooltips-method |
Show the structure of s-attributes | tree_structure tree_structure,corpus-method tree_structure,subcorpus-method tree_structure,xml_document-method tree_structure,xml_node-method |
Trim an object. | punctuation trim trim,DocumentTermMatrix-method trim,TermDocumentMatrix-method trim-method |
Add corpora in R data packages to session registry. | use |
Inspect object using View(). | view |
Apply Weight to Matrix | weigh weigh,count-method weigh,count_bundle-method weigh,DocumentTermMatrix-method weigh,TermDocumentMatrix-method |